Data Engineer (Databricks, NLP model development)
Ework Group - founded in 2000, listed on Nasdaq Stockholm, with around 13,000 independent professionals on assignment - we are the total talent solutions provider who partners with clients, in both the private and public sector, and professionals to create sustainable talent supply chains.
With a focus on IT/OT, R&D, Engineering and Business Development, we deliver sustainable value through a holistic and independent approach to total talent management. By providing comprehensive talent solutions, combined with vast industry experience and excellence in execution, we form successful collaborations. We bridge clients and partners & professionals throughout the talent supply chain, for the benefit of individuals, organizations and society.
For our client, a global pharmaceutical company, we are recruiting for a Data Engineer role.
Role overview: You will combine clinical data expertise with strong data engineering and technical skills to build well-documented pipelines from source data to curated data sets in common data models such as CDISC SDTM. You will collaborate closely with clinical SMEs, data scientists, infrastructure, and other skilled data engineers. Over time, we plan to include external benchmarking data as a FounData product, and you will help automate the extraction and harmonisation of competitor clinical trial data from public registries and publications into structured, analysis-ready formats.
Key responsibilities:
- Productionise and monitor pipelines and models; collaborate on CI/CD, testing, and user feedback.
- Implement ETL patterns (medallion architecture), ensuring data provenance, validation, and versioning.
- Take part in the continuous improvement and validation of existing pipelines.
- Ensure clinical concepts are correctly represented and harmonized across data models (CDISC SDTM/ADaM, OMOP, HL7); contribute to mapping and transformation logic.
- Develop NLP models for entity and relation extraction (e.g., inclusion/exclusion criteria, demographics, endpoints, study design).
- Build automated pipelines to ingest registry and publication data and convert to tabular, queryable datasets.
- Co-design the benchmarking data model with end users and map extracted information to standardized terminologies.
- Integrate human-in-the-loop review, confidence scoring, and vocabulary/units normalization.
Requirements:
- At least 5 years of experience as a Data Engineer
- Strong data engineering skills: Databricks, Spark, Delta Lake, SQL, ETL design and orchestration.
- Familiarity with clinical trial concepts (inclusion/exclusion criteria, endpoints, demographics) and biomedical terminologies.
- Practical experience with data modeling and working with end users to define requirements.
- Experience with CI/CD, testing frameworks, and monitoring for data pipelines and ML models.
- Experience with NLP for information extraction from scientific text (publications, registries).
- Fluency in English, both written and spoken
- Nice to have: experience in the pharmaceutical sector or clinical research data environments
We offer:
- B2B agreement
- Transparent working conditions with both Ework and the client
- Ongoing support throughout our cooperation
- Possibility to work in an international environment
- Collaborative environment with a Swedish organizational culture
- Private medical care
- Life insurance
- Multisport
- Teambuilding events
Contact person: mateusz.jozefiak@eworkgroup.com
Client code: HNN01
Do you know someone who would fit this position? Recommend a candidate by sending her/his CV to: polecenia@eworkgroup.com.
The Whistleblowing Policy, which provides guidelines for reporting misconduct, can be found on the Ework website: https://www.eworkgroup.com/about-us/our-responsibility
- Locations: Remote
- Technologies: Amazon Web Services (AWS), Azure, Continuous Integration / Continuous Deployment, Databricks, ETL, Natural Language Processing, Python, SQL, Spark
- Language: English, Polish