September 17, 2022
The sheer volume, variety, and velocity of data in this modern era have enabled significant advancements in many research areas. However, the advancements in the research community thanks to Big Data do not necessarily translate to the benefit of society; of ordinary people living ordinary lives. There is indeed a gap between breakthroughs in the […]
Read moreMay 10, 2022
Financially, poor data quality costs organizations some ludicrous amounts of money. Worse, poor data quality is a strong inhibitor to the success of data science: No analytical method can create value from poor quality data. As a consequence, data science projects invest a majority of their resources on cleansing data. However, cleansing resists automation as […]
Read moreJune 26, 2020
Data visualization and analytics are nowadays one of the cornerstones of Data Science, turning the abundance of Big Data being produced through modern systems into actionable knowledge. Indeed, the Big Data era has realized the availability of voluminous datasets that are dynamic, noisy and heterogeneous in nature. Transforming a data-curious user into someone who can […]
Read moreMarch 20, 2020
Data visualization and analytics are nowadays one of the cornerstones of Data Science, turning the abundance of Big Data being produced through modern systems into actionable knowledge. Indeed, the Big Data era has realized the availability of voluminous datasets that are dynamic, noisy and heterogeneous in nature. Transforming a data-curious user into someone who can […]
Read moreApril 18, 2018
When dealing with real-world data, dirty data is the norm rather than the exception. We continuously need to predict correct values, impute missing ones, and find links between various data artefacts such as schemas and records. We need to stop treating data cleaning as a piecemeal exercise (resolving different types of errors in isolation), and […]
Read more