In general, data veracity is defined as the accuracy or truthfulness of a data set. In many cases, the veracity of the data sets can be traced back to the source provenance. In this manner, many talk about trustworthy data sources, types or processes. However, when multiple data sources are combined, e.g. to increase variety, the interaction across data sets and the resultant non-homogeneous landscape of data quality can be difficult to track.
As the Big Data Value SRIA points out in the latest report, veracity is still an open challenge of the research areas in data analytics.
Content validation: Implementation of veracity (source reliability/information credibility) models for validating content and exploiting content recommendations from unknown users;
It is important not to mix up veracity and interpretability. Even with accurate data, misinterpretations in analytics can lead to the wrong conclusions. However, this is in principle not a property of the data set, but of the analytic methods and problem statement.