Data Engineer

Skokie, Illinois, US
May 07, 2021
May 31, 2021
Climate Change
Employment Type
Full time
About LanzaTech:

Mission-driven LanzaTech offers a revolutionary solution to climate change, capturing pollution to clean our skies and oceans and to create new high-value products for a sustainable future. LanzaTech captures pollution to make fuels and chemicals, including sustainable jet fuel and products you use every day. The technology is like retrofitting a brewery onto an emission source such as a steel mill or a landfill site, but instead of using sugars and yeast to make beer, biology converts pollution into fuels and chemicals! Imagine a day when your plane is powered by recycled emissions, and your yoga pants started life as pollution from a steel mill. This future is possible with LanzaTech technology.
  • 2021 Biofuels Digest #1 Hottest Company in Renewable Fuels, Chemicals & Biomaterials
  • 2020 Biofuels Digest #3 Best Company to Work for in the Advanced Bioeconomy
  • 2020 Fast Company World Changing Company
  • 2020 CNBC #43 on Top 50 Disruptor Companies list
  • 2019 Chicago Innovation Awards Winner
  • 2019 Fortune Magazine - Change the World - Companies to watch
  • 2019 Kirkpatrick Chemical Engineering Achievement Award
  • LanzaTech is committed to Diversity, Equity & Inclusion as part of our mission, culture & core values

This position is open to candidates authorized to work in the United States on a full-time basis for any employer. LanzaTech is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.

About the role
  • We have an exciting new opportunity for a data engineer to join the Computational Biology group at LanzaTech.
  • You will explore data from our science and engineering teams, engineer complex datasets, and build robust and optimized data pipelines that power our AI/ML workflows.
  • You will develop and operate these data pipelines, working closely with our modeling team to solve their data requirements. You will also analyze data and provide support on data quality and validation.
  • This role involves collaboration with other departments in a fast-paced scientific organization. Your work will directly contribute to the success of LanzaTech's carbon recycling vision by allowing us to utilize valuable data generated across the company.
  • This will begin as a fully remote position, with the possibility of ongoing remote or flexible work; however, you must live in the Chicago/Skokie area or be willing to relocate when needed.

Key duties
  • Understand, analyze and visualize data generated by LanzaTech's science and engineering teams
  • Measure, monitor and report on data quality and provide feedback to relevant teams
  • Build data pipelines for analytical usage and modeling
  • Work with data labeling teams and monitor the quality of annotations
  • Work with data warehouse team to improve ETL processes
  • Clean and wrangle data
  • Build and publish standardized internal data sets
  • Work with team members to help them understand data
  • Drive continuous improvements to data collection and storage
  • Design, implement and administer databases
  • Write well-structured, documented and tested code in Python
  • Closely collaborate with researchers on data products and AI/ML projects

Required skills
  • A bachelor's degree in Computer Science, or equivalent experience and professional certifications
  • Intermediate Python development skills
  • Extensive experience engineering data for analytics and AI/ML use cases
  • Experience with analysis, design, optimization and development of database queries and ETL processes
  • Experience in database design, implementation and administration (SQL)
  • Familiarity with statistics methods and relevant packages
  • Experience building interactive data visualizations (Tableau, Plotly, D3)
  • Strong analytical and problem-solving skills, and excellent attention to detail
  • The ability to be proactive and take ownership of your work
  • Clear oral and written communication skills

Nice-to-have experience

We are especially excited about candidates who have experience with:
  • Workflow management tools (Luigi, Airflow)
  • Scientific computing and analysis packages (NumPy, SciPy, Pandas)
  • Cloud based technologies and databases (especially AWS and GCP)
  • DevOps (CI/CD pipelines, containerization, deployment automation)
  • Machine learning libraries (TensorFlow, PyTorch, XGBoost, scikit-learn)
  • RESTful APIs
  • Scientific data
  • Biological and time series data
  • Computational biology, microbiology or fermentation

