Data Engineer Intern
Location: CA – Pasadena (open to remote)
We are looking for a Data Engineer Intern to support and improve our data pipeline processes and enable new uses for our data stores. You will collaborate with our data engineers, data scientists, data analysts, and product stakeholders to implement processes and infrastructure that support our data-driven reports and analytics. These systems process billions of location data points per day.
Daily work may involve:
- Monitoring and troubleshooting operational or data issues in the data pipelines.
- Assembling large, complex data sets for client samples.
- Crafting regularly scheduled data feeds by extracting and transforming data from various sources.
- Querying data stores for report generation and analytics.
- Documenting best practices and organizing internal technical collateral.
- Setting up and updating data stores.
- Learning to manage client requests and go through an agile iterative cycle of requirement gathering and report building.
- Developing deeper knowledge of location data, including but not limited to geofencing, polygon filling, and contact tracing.
Knowledge / Background
- Familiarity with a UNIX/Linux environment, including basic commands and shell scripting
- Education: A Bachelor's degree or above in Computer Science, Applied Mathematics, Engineering, or another technology-related field is preferred; equivalent work experience is also accepted for this position.
- Interest in big data and data stores (Spark, Hive, or Hadoop)
- Cooperative team spirit with strong attention to detail
- Ability to break down complex problems into manageable steps
- Ability to build rapport with product, project, and QA teams
- Excellent critical-thinking, problem-solving, and mathematical skills, and sound judgment
- Experience with scripting languages, especially Python
- Exposure to cloud-based platforms like Amazon Web Services, Azure, or Google Cloud Platform
- Exposure to data visualization tools like Tableau or Power BI
- Familiarity with relational databases and query authoring (SQL)
- Interest in EMR and Athena
- Interest in GIS
- Nontechnical: requirements gathering, Kanban development practices, agile delivery
- Languages for processing data: SQL, Python, PySpark
- Frameworks: GIS, EMR, Athena, and other AWS services (EKS, RDS, etc.)
- Best practices for data management including database modeling, ETL development, ETL coordination
Hours per week: 10-40 hours
Pay Rate: $25-27/hour
UM provides the highest quality mobile data solutions trusted by businesses to creatively solve their persistent challenges. The company’s diverse suite of products processes billions of social, demographic, and location signals daily for Fortune 500 companies across retail, travel, and entertainment to better understand and influence modern consumers with the most accurate business decision science. Recognized as a pioneer in targeted mobile advertising, UM was listed among Fast Company’s “50 Most Innovative Companies,” The Wall Street Journal’s Top “50 Startups,” and Entrepreneur Magazine’s “Best Entrepreneurial Companies in America,” and as one of Advertising Age’s “Best Places to Work.” UM is headquartered in Pasadena, CA.