Wednesday, February 7, 2018

Learn to code?

Also in this week's C&EN, this interesting suggestion from Professor Javier García Martínez: 
...However, for this opportunity to be fully realized, chemists should be able to talk to machines. Unfortunately, few chemists can actually code, let alone program a robot or write an algorithm to design and run a better set of experiments. Robotics or AI are rarely part of the chemistry curriculum, even at graduate school. This is especially worrisome considering that a recent report by Dell Technologies estimates that 85% of the jobs that will exist in 2030—when our current students will be in their early 30s—have not been invented yet but will definitely require those skills. 
The chemical industry will be profoundly transformed by the convergence of technologies that defines the fourth industrial revolution. According to the World Economic Forum, the digitalization of the chemical industry will create revenues in the $310 billion to $550 billion range, reduce CO2 emissions by 60 million to 100 million metric tons, and avoid 2,000 to 3,000 injuries over the next decade. This will require profound adaptations—and on a very short notice—in the workforce, leadership, and organization of a $5 trillion industry.
Call me skeptical that one more item should be bolted onto graduate training in chemistry, but learning to code or work with AI technologies seems like a reasonably wise thing for a graduate student in the sciences to do. 


  1. We were taught basic algorithmic thinking and Fortran coding during our undergrad years in the early 2000s. It would have been much more useful to gain more coding experience than to waste hundreds of hours replying to referees' stupid questions in order to survive "publish or perish."

  2. Before coding even, I recommend lots of statistics and familiarity with common stat packages. Follow this up with visualizations of data. Automation removes the bottleneck of experimental execution, hence planning your experiments and understanding the mountains of multivariate data are the two most important skills. Most of the medical sciences and engineering emphasize this, but no chemistry curriculum does. A "nice to have" would be stat languages like R and MATLAB, along with some familiarity with coding in Python. Actually programming a robot is less important. If you know what you want it to do (i.e. be creative and develop a use case), just work with the hardware vendor or a company good with UI design (and data integration expertise).

  3. Over the past several years in my job I have learned programming in Python (and to a lesser extent, R, C, C++). It is a regret that I did not learn this as an undergraduate, because it would have saved me a huge amount of time in grad school and work.

    To me, the benefits of programming are:
    (1) unlike something like Excel (where the formulas are hidden in cells), the source code is itself a document of what steps you used, so if you suspect a mistake or want to share your method with others, it is far better documented.

    (2) I use Python a lot to process data. I get a big bang out of batch processing at times, which probably saves me about 30 min a day on some projects, once I have the batch script prepared.

    (3) If I am trying to learn some new math or ways to process data (say background correction), I can fairly quickly throw together something to see if I can get the same results as my reference (say I am following a paper) and eventually re-use some of the same code if I decide to use that method.

  4. I made this comment last year when the same topic/question was raised here, so I'll copy what I said:

    “My two cents:

    It’s not just that “chemistry majors should learn to code”; I feel that all college graduates today should learn to code. Programming is becoming a fundamental type of literacy these days. Just like how all college graduates should be fully literate in English and have some exposure to mathematics (e.g. calculus), all graduates should also have some experience with coding or programming.

    As to how to incorporate programming into a typical undergraduate chemistry curriculum – I’m not entirely sure. Like a lot of people here, I took a required course as an undergrad on Matlab programming, after which I promptly forgot everything, since we never used it again. My PhD work in synthetic organic chemistry also involved zero programming, and other organic chemists here will probably also have similar experiences. In organic chemistry, programming is one of those things that is nice to know, but not at all necessary for success, and may even be viewed as somewhat of a distraction – is knowing how to program in Java going to get you better separations in your columns? Not really.

    Everything I know about programming came AFTER I finished my PhD – I self-taught programming with online courses, starting with Codecademy, and after I felt I had reached a decent level of competency, I enrolled in a “Data Science” bootcamp last year. Everything I learned was completely orthogonal to chemistry; there’s little overlap between training and running a machine learning model using Python/scikit-learn and being able to do asymmetric oxidations at -78 C.

    If you’re doing computational chemistry, then sure, knowing fundamental programming and CS is incredibly important. In experimental synthetic chemistry…I’m not so sure. My academic experiences have proved that programming has limited utility in chemistry. I think it’s time for this part of chemistry to catch up to the modern age as well. Like Anon 3:15 PM says, if you can type print('Hello World!') into a Python interpreter, then congratulations – you know more programming than 99% of organic chemists. But you also know less programming than 100% of professional developers.”

  5. I work at a small coatings/adhesives/sealants company. Most of the work involves formulating and characterizing materials, but using a robot is part of the job. The formulations have to be pumped out a nozzle. The nozzle is attached to a robotic arm that is programmed to place the dispensed formula where it needs to go. The rheological properties need to be optimized for the robotic process, and the robotic process in turn needs to be optimized for the rheological properties. It's a tug-of-war, but the point is that robotics are essential. Customer parts and requirements vary, so you have to be able to re-program. The current crop of academics will not be interested. I have tried many times to get into academia, arguing that the added benefit is industrial knowledge and appropriate research projects (bringing chemistry students research projects that will involve skills/knowledge appropriate to industry including robotic programming), but I never get interviewed despite having over 30 publications, experience obtaining DOE grants and two post-docs (one with a Nobel laureate).

  6. All ACS undergraduate chemistry programs should require students to take C++/FORTRAN and Python programming courses. At my undergrad, every chemistry major had to take a programming course (I took C++). I taught myself MATLAB for grad school, but I wish I had taken more programming courses in undergrad and grad school.

  7. Since I am lazy, I usually get a labmate to write Python scripts for me and put him as a middle author on my papers. But there are really a lot of ways in which Python can accelerate data analysis any time you are exporting a CSV file with a bunch of values from an instrument. For example, it can save me up to several hours compared to doing a comparable analysis in Excel if I'm doing kinetics.

  8. I agree that coding is a useful skill (that I don't have or plan to learn) for chemists, but statements like "recent report by Dell Technologies estimates that 85% of the jobs that will exist in 2030—when our current students will be in their early 30s—have not been invented yet" tend to lose me.

    Perhaps I'm being cynical, but looking back at predictions about the future is always amusing:

    I'm disappointed we still don't have our base on the moon, though maybe Space: 1999 was not fictional.... WHERE IS MY FLYING CAR??????

    1. "Known soothsayers Dell Technologies"

    2. From that article:

      "When incomes double... some of the population will prefer two 20-hour-week jobs, to keep busy and increase income. Others may prefer to work six months and play six.

      "Still others may prefer to work very hard for 10 or 15 years so that they can retire at 40 and indulge in other activities for the rest of their lives.

      "The professions would never get any rest. The very nature of their work makes them completely dedicated to it."

      Beam me up to moon-base alpha where Catherine Schell will greet me, please.

    3. I have to agree with you. If the Dell Tech person really means that 85% of all jobs in 2030 will require the ability to personally program a robot to do your work, then I'd expect the unemployment rate to hover somewhere around 60% by then. If I'm not mistaken, I think under half of Americans currently even have a college degree, and now we are saying the vast, vast majority of future jobs are going to be computer science jobs in just under 12 years. Nope, I don't buy it.

  9. I spent 36 years in the analytical division of a once iconic company. I quickly learned that automation was the key to reproducible results. Happily, most of the software I depended upon could be extended by the user. In the later years of my career I increasingly relied on R and Python (including Jython for code that needed to interface with Java.) I have benefitted greatly from the ImageJ/Fiji, R-stats, Python, GitHub, and Software/Data Carpentry communities. It has become easier to integrate multiple languages into language-agnostic documents using Jupyter notebooks and RMarkdown documents. The next generation of chemists will need to be motivated, self-directed learners. Training budgets were one of the first casualties of the never-ending cost-reduction cycle.

  10. For a synthetic chemist I would categorize coding as a 'nice to know' skill. I'm a little skeptical that it's going to play a big role in an employer's decision to hire somebody, mainly because for other 'nice to know' skills employers generally expect synthetic chemists to learn them on the job. The main example is medicinal chemistry. If you work as a synthetic chemist in the drug industry you'll spend a whole lot of time on med chem stuff, not just synthesizing compounds. However, employers still seem to always prefer new grads who are great synthetic chemists with no med chem knowledge over good synthetic chemists with good med chem knowledge. I don't see why it would be any different for a synthetic chemistry job that required some coding knowledge.

  11. As an undergrad in the 70's my freshman calculus class required I evaluate integrals numerically using a BASIC program... This was in the era of punch cards.

    On a co-op job as an undergrad I had to code, in BASIC, an HP desktop "calculator" that had a digitizer, to calculate molecular weight distributions from GPC data.

    As an undergrad I also took a beginning Fortran course as an elective.

    I taught myself Pascal when that was in vogue.

    I have done some coding on every job I have had in my career... in some cases my apps were used by most of the technical staff... Nothing very sophisticated but great time savers!

    When I was in school I never heard of R or Python or MATLAB (not sure when they came into existence). The places I've worked were production and development oriented, not academic.

    I never could get comfortable with C or C-like languages... The concepts were not the issue... I just found the syntax hard to read and unintuitive...

    But about 17 years ago I found an XPlatform (Mac/Win/Linux) RAD product using a true object-oriented language with BASIC-like syntax (unlike VB). X-Platform was a big deal, as every place I've worked for the last 20+ years was both Mac and Windows.

    Anyway coding has been very helpful for me over the years (I am still in the lab BTW)

    Typically I was the only chemist with some coding skills. The others that had them tended to be engineers.

  12. I have zero use for coding as an organic, synthetic chem PhD. There are a few times when I think about it being useful (i.e. writing custom drivers and control software for various bits of equipment and databases for organising spectra and documentation) but I'm sure that 99% of the time it would just be an interesting distraction from my real work.

    That said, I do have some experience with code, but for personal projects unrelated to chemistry such as proxy servers and web-based plugins. Sadly these personal projects tend to distract me from my PhD work...

  13. I work as a data scientist at a somewhat successful biotech startup that uses robotics to synthesize RNA for CRISPR. While most of the software team has science backgrounds, they all know how to code. Even the few people who are hired as chemists have some coding experience. When they hired me, apparently they had candidates with impressive credentials and backgrounds (Chemistry and Biology PhDs from Ivy League/Stanford), but no one actually knew how to code well besides me, so I got the job. Granted, this is a startup, so who knows if it will continue to be successful, but I think it tells you something about what the future holds.

  14. Knowing how to code helped me somewhat to get my job in the first place (synth org chemist in a large corporation), because they were looking for a jack of all trades (org chem + phys chem + DFT). But where it really came in handy was when I started to carve out a niche for myself in the organisation - the combination of chemistry, physics, data science and coding skills enabled me to develop tools which made the life of my fellow lab workers and mid-managers easier, but were completely outside of our IT department's imagination.
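
Several of the comments above describe the same workflow: an instrument exports a CSV file, and a short Python script processes a whole folder of them in one pass instead of repeating the analysis by hand in Excel. The following is a minimal sketch of that idea for the kinetics case, not any commenter's actual script. It assumes each file has a header row with two hypothetical column names, `time_s` and `conc_mM`, and that the reaction is first order, so that ln(concentration) falls on a straight line against time with slope -k. The function names are likewise made up for illustration.

```python
import csv
import math
from pathlib import Path


def read_trace(path):
    """Read (time, concentration) columns from one CSV export.

    Assumes a header row with the hypothetical column names
    'time_s' and 'conc_mM'; adjust to match the real instrument export.
    """
    times, concs = [], []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            times.append(float(row["time_s"]))
            concs.append(float(row["conc_mM"]))
    return times, concs


def fit_first_order(times, concs):
    """Fit ln(conc) = ln(c0) - k * t by ordinary least squares.

    Returns (k, c0). Requires positive concentrations and at least
    two distinct time points.
    """
    logs = [math.log(c) for c in concs]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return -slope, math.exp(intercept)


def batch_fit(folder):
    """Fit every *.csv file in a folder; returns {filename: (k, c0)}."""
    results = {}
    for path in sorted(Path(folder).glob("*.csv")):
        times, concs = read_trace(path)
        results[path.name] = fit_first_order(times, concs)
    return results
```

Calling `batch_fit("kinetics_runs")` (a hypothetical folder name) would return one rate constant and one initial concentration per file. Because the script records every step, it also serves as the documentation of the analysis that comment 3 values over hidden spreadsheet formulas.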