Responsibilities
Develop and manage end-to-end data pipelines, covering data ingestion, transformation, quality assurance, and integration to support enterprise analytics solutions.
Partner with solution design and business stakeholders to define data needs and compile complex datasets aligned with business objectives.
Design, implement, and optimize analytics solutions to meet both technical and functional requirements.
Collaborate with data architects to maintain consistency and integrity of data models.
Build and maintain scalable infrastructure for efficient data extraction, transformation, and loading (ETL) across diverse data sources.
Develop tools and frameworks to enable data analysts and data scientists to efficiently build and enhance data models.
Work closely with DevOps teams to ensure stable and reliable deployment and operation of data platforms.
Collaborate with data analysts and data scientists to design and develop APIs that support analytics and AI use cases.
Requirements
Bachelor's or Master's degree in Computer Science, Information Technology, or a related discipline.
Minimum 3 years of experience in SQL/PostgreSQL, data engineering, and BI solutions, including integration with third-party systems.
Hands-on experience with technologies such as ERP systems, Spark, Scala, Python, SQL scripting, relational databases and data warehouses, NoSQL platforms (e.g., MongoDB, Cassandra), and cloud platforms (e.g., Azure).
Proven experience with modern data platforms and tools such as Data Lake, Databricks, Data Factory, and BI dashboard development.
Strong track record of handling large-scale, complex datasets and building end-to-end pipelines in both on-premises and cloud environments.
Proficiency in coding for data management, data warehousing, and unstructured data processing.
Experience in energy or other asset-heavy industries is advantageous.
Familiarity with developer productivity tools such as GitHub Copilot.
Understanding of Generative AI concepts (e.g., RAG), orchestration frameworks (e.g., LangChain, LlamaIndex), and vector databases.
