By Mike Jackson, Software Architect
As part of my open call consultancy for LUX-ZEPLIN (LZ), I looked at how LZ could migrate their data and computation from Excel to MongoDB and Python. There are many resources with valuable advice on cleaning data in Excel into a form suitable for analysis using Python, R or other data analysis packages. Unfortunately, how to handle formulae and cross-references is little discussed.
Based on my experiences, I have written a guide on “Tips for porting formulae from Excel into code” in which I provide some (hopefully) helpful hints on how to identify and highlight formulae and cross-references, which can help when porting these to Python or R, and to restructure tables so that raw data is contiguous, and so is easy read by data analysis packages or to export into a database or files. Feedback, suggestions and additional advice is more than welcome.
Feel free to add these as comments!