yermatters.blogg.se

Data collection validity and reliability





Metadata is the information describing the data, rather than the data itself. In this post, I will describe validity testing, break down the concept of accuracy testing, and review the available testing frameworks. In our case, asking whether a data set is OK is the same as asking, “Is it valid and accurate?”

The data quality dimensions listed below were defined with a wide view of designing a data warehouse. They consider all data sets defined and collected, the relations between them, and their ability to properly serve the organization. When we look at a single data set, our quality considerations are narrower:

  • Completeness is not required, as other data sets may compensate.
  • Consistency and integrity are irrelevant, since other data sets are not considered.
  • Timeliness depends mostly on the engineering pipeline functioning properly, rather than on the quality of the data.

Traditionally, data quality is split into six dimensions:

  • Accuracy: How well does a piece of information reflect reality?
  • Completeness: Does it fulfill your expectations of what’s comprehensive?
  • Consistency: Does information stored in one place match relevant data stored elsewhere?
  • Timeliness: Is your information available when you need it?
  • Validity (aka Conformity): Is the information in a specific format, type, or size? Does it follow business rules/best practices?
  • Integrity: Can different data sets be joined correctly to reflect a larger picture? Are relations well defined and implemented?
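To make the validity dimension more concrete, here is a minimal sketch of a per-record validity check in Python. The record fields (`quantity`, `email`, `country`), the regular expression, and the rules themselves are assumptions made up for illustration; they do not come from any particular testing framework or data set.

```python
import re

# Hypothetical rules for an illustrative record; the field names and limits
# are assumptions, not part of any real schema.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validity_errors(record):
    """Return a list of validity violations for one record (empty if valid)."""
    errors = []

    # Type check: quantity must be an integer (bool is excluded explicitly,
    # because bool is a subclass of int in Python).
    quantity = record.get("quantity")
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        errors.append("quantity: expected an integer")
    # Business rule: quantity must be positive.
    elif quantity <= 0:
        errors.append("quantity: must be positive")

    # Format check: email must look like name@domain.tld.
    email = record.get("email")
    if not isinstance(email, str) or not EMAIL_PATTERN.match(email):
        errors.append("email: not in name@domain.tld format")

    # Size check: country must be a 2-character code.
    country = record.get("country")
    if not isinstance(country, str) or len(country) != 2:
        errors.append("country: expected a 2-letter code")

    return errors


if __name__ == "__main__":
    records = [
        {"quantity": 3, "email": "a@b.com", "country": "SE"},
        {"quantity": 0, "email": "nope", "country": "SWE"},
    ]
    for record in records:
        print(record, "->", validity_errors(record) or "valid")
```

Returning a list of violations rather than a single boolean is a design choice: it lets a report show which rule failed (type, format, size, or business rule) for each record, which is usually what you need when a data set turns out not to be OK.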






