ehrQL and cohort-extractor

Missing features from cohort-extractor

  • Many features of cohort-extractor are not yet implemented in Data Builder.
    • The current development approach is to implement a few features in Data Builder fully end-to-end.
    • See the ehrQL reference for a complete list of supported features.
  • ehrQL currently has no way to generate dummy data.

The development plan for cohort-extractor

cohort-extractor will continue to be supported by OpenSAFELY while Data Builder is in this initial design phase.

Once ehrQL is ready for general use, cohort-extractor will continue to be maintained, where possible, so that ongoing OpenSAFELY studies can continue to be run.

However:

  • New features are likely to be added only to ehrQL.
  • It may become infeasible to support cohort-extractor if the currently supported data backends undergo considerable change.

More detail for existing cohort-extractor users

The dataset definition used by ehrQL has the same underlying purpose as cohort-extractor's study definition.

To extract data, an OpenSAFELY research study would typically use one of:

  • cohort-extractor, with a study definition
  • ehrQL, with a dataset definition

Dataset definitions have a considerably different structure from study definitions. You will need to refer to the documentation for the new language to write a dataset definition.

Cohorts are now referred to as datasets. This accommodates the possibility of handling types of data other than purely patient data.

The main researcher-facing change with the introduction of ehrQL is the new language for extracting datasets. ehrQL does not affect the rest of the structure of an OpenSAFELY project.