Peter Tenenbaum
Senior Pipeline Architect
Disciplines: Data Science, Software Engineering, Accelerator Physics
Degree/Major: PhD 1995, University of California, Santa Cruz
Role: Scientist
Biography
My professional life began in the field of particle accelerators. I spent a combined seventeen years at the SLAC National Accelerator Laboratory in Menlo Park and the European particle physics laboratory, CERN, in Geneva, Switzerland. In 2008, I took a one-year leave of absence from SLAC to work at the SETI Institute on data analysis software for the Kepler Mission. Seventeen years later, I’m still here!
The Kepler mission was the first large-scale search for transiting exoplanets: it used a space-based telescope to observe a fixed section of the sky for four years, taking images of 170,000 individual stars every thirty minutes. This data allowed us to deduce which of these target stars hosted planetary systems: by detecting the periodic and minuscule dimming of a given star, we could infer the existence of a planet that passed in front of (“transited”) the star, blocking a fraction of its light and thus reducing its brightness for a short interval. The SETI Institute provided the data analysis system that processed that enormous collection of data and identified the best potential planet signals. My first job on the project was to write software that used observations of the target stars to determine the alignment and orientation of the 42 CCDs that served as detectors for the telescope. Later I worked on the Transiting Planet Search (TPS) software, which performs the actual detection of potential transiting planets, and then on Data Validation (DV), which performs a series of tests that help determine whether a signal found by TPS might be a planet or is more likely some other kind of signal that mimics a transiting planet.
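The heart of a transit search like TPS can be sketched in a few lines: phase-fold the light curve at each trial period and ask whether a box-shaped dip stands out above the noise. The toy Python below illustrates that idea; the cadence, depth, period grid, and every numeric value are invented for the example, and the real TPS (with its careful noise modeling) is far more sophisticated.

```python
import numpy as np

def fold_and_score(time, flux, period, duration):
    """Phase-fold a light curve at a trial period and score how strongly
    a box-shaped dip of the given duration stands out above the noise."""
    phase = time % period
    in_transit = phase < duration
    if in_transit.sum() < 3 or (~in_transit).sum() < 3:
        return 0.0
    depth = flux[~in_transit].mean() - flux[in_transit].mean()
    noise = flux.std() / np.sqrt(in_transit.sum())
    return depth / noise  # a crude signal-to-noise ratio

# Toy light curve: 30-minute cadence over ~90 days, with a 0.1% dip
# every 10 days (all numbers invented for illustration).
rng = np.random.default_rng(0)
time = np.arange(0, 90, 0.5 / 24)             # days
flux = 1.0 + 5e-4 * rng.standard_normal(time.size)
flux[(time % 10.0) < 0.2] -= 1e-3             # ~5-hour, 0.1%-deep transit

# Grid search over trial periods, as a transit search does at heart.
periods = np.arange(1.0, 30.0, 0.01)
scores = [fold_and_score(time, flux, p, duration=0.2) for p in periods]
best = periods[int(np.argmax(scores))]
print(f"best trial period: {best:.2f} days")  # recovers ~10 days
```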
A few years later, the SETI Institute joined the TESS (Transiting Exoplanet Survey Satellite) mission. Like Kepler, TESS uses a space-based telescope to search for planets that transit their host stars, but its observing strategy gradually surveys the entire sky rather than concentrating on a fixed patch in the manner of the Kepler mission. It uses instruments with a much larger field of view than Kepler’s, and it is designed to do its best work inspecting stars that are much closer than the Kepler mission’s targets: while a typical Kepler target star might be 2,000 light years away, typical TESS targets are within 300 light years of Earth. My early work on TESS focused on replicating many of the software applications pioneered by Kepler, adapting them to TESS’s very different instrument and observing strategy, as well as to its data volume, which is about 20 times larger than Kepler’s.
After a few years, my work on TESS transitioned from the data analysis / data science software to the software infrastructure that orchestrates the overall process: that is to say, after time spent as an accelerator physicist and as a data scientist, I was now becoming a software engineer. As the Lead Software Engineer (LSE) for the TESS data analysis team, I managed and supervised a small team of software engineers who wrote, tested, and maintained the infrastructure code for TESS, as well as writing a few tens of thousands of lines of code myself. Not bad for somebody whose formal software engineering education is two introductory classes taken at a community college.
As the TESS project matured, and the power of its data analysis pipeline became evident, it occurred to several people that the pipeline contained a central core of software that could be used to orchestrate the operations of any data analysis pipeline, not just the TESS pipeline. Thus began the Ziggy project, the goal of which is to spin off that lowest level of orchestration software, improve upon it, and make it sufficiently generic that it can support any data analysis pipeline. Here again I supervise a (really) small team of software engineers and also get to write a ton of code myself.
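To give a flavor of what that lowest level of orchestration software does, the sketch below shows the concept in miniature: a pipeline as an ordered chain of processing modules, executed with bookkeeping and failure handling. This is a hypothetical Python illustration of the idea only, with invented names; it is not Ziggy’s actual interface (see https://github.com/nasa/ziggy for the real thing).

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Module:
    name: str
    run: Callable[[dict], dict]   # consumes and produces a data bundle

@dataclass
class Pipeline:
    modules: list[Module]
    log: list[str] = field(default_factory=list)

    def execute(self, data: dict) -> dict:
        # Run each module in order, recording success or failure,
        # and stop cleanly if any stage raises.
        for module in self.modules:
            try:
                data = module.run(data)
                self.log.append(f"{module.name}: complete")
            except Exception as err:
                self.log.append(f"{module.name}: failed ({err})")
                raise
        return data

# A toy three-stage pipeline in the spirit of calibrate -> search -> validate.
pipeline = Pipeline([
    Module("calibrate", lambda d: {**d, "calibrated": True}),
    Module("search",    lambda d: {**d, "candidates": 3}),
    Module("validate",  lambda d: {**d, "planets": d["candidates"] - 1}),
])
result = pipeline.execute({"raw_pixels": "..."})
print(pipeline.log, result["planets"])
```

The point of factoring the orchestration out this way is that the modules know nothing about each other or about the scheduler, which is what makes the same core reusable across entirely different missions.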
Major Awards
- NASA Group Achievement Award (as member of the Ziggy Development Team), 2023.
- NASA Exceptional Public Achievement Medal, 2020.
- NASA Ames Honor Award for Outstanding Technical Contributions and Leadership in the TESS Science Processing Operations Center, 2017.
- NASA Certificate of Appreciation for Exceptional Contribution to the Development of the Kepler Data Analysis Pipeline, 2011.
- NASA Ames Honor Award for excellence in the category of Contractor Employee, 2010.
- NASA Software of the Year Award (as member of the Kepler Science Operations Center software development team), 2010.
Links
Alternate contact: ptenenbaum@seti.org
Related Projects
I remain Lead Software Engineer (LSE) for the TESS Science Processing Operations Center (SPOC). The TESS mission is about to enter its third mission extension (EM3). Each mission extension has called for expanded data and science capabilities, and the software engineers and data scientists in the SPOC are responsible for adapting the system to meet the new and expanded processing requirements for the mission. We also perform “sustaining engineering” (i.e., fixing things when they break) for the entire system.
I’m also the Senior Pipeline Architect for the Ziggy project. Ziggy is available as open-source software: check out https://github.com/nasa/ziggy. Since its spin-off from TESS, Ziggy has been vastly improved and expanded to support a wide variety of data analysis missions, particularly missions with extreme data volumes.
An example of a mission with extreme data volumes is the GOES earth-observing satellites: each of the satellites (GOES-16 and GOES-18) produces about 1 terabyte (TB) of data per day. I’m working with the NASA Earth Exchange (NEX) team to build a data analysis pipeline, based on Ziggy, that automates processing of this data.
In a nice example of “what goes around comes around,” the TESS team is currently adapting its data analysis pipeline to run under Ziggy! This allows the TESS pipeline to take advantage of all the improvements made to Ziggy relative to the original TESS pipeline infrastructure. This project is close to code-complete, and we are currently performing data volume tests and other acceptance trials.