Garbage In, Bias Out: Are AI Models or Data Pipelines the Real Problem

Authors

  • Praneeth Kumar Reddy Palampalli
  • Keerthana Allam

DOI:

https://doi.org/10.65579/31075037.0140

Keywords:

Artificial Intelligence (AI), Algorithmic Bias, Data Bias, Data Pipelines, Machine Learning Ethics, Fairness in AI, Data Governance, Bias Mitigation, Responsible AI, Data Quality, Algorithmic Accountability, Ethical AI Systems

Abstract

The rapid adoption of artificial intelligence (AI) across industries has raised concerns about bias, fairness, and accountability in algorithmic decision-making. Although biased outcomes are often attributed to flawed AI models, the quality and structure of the underlying data pipelines is a less visible but equally important and understudied issue. The central question of this research is whether biased AI outputs are primarily a model-design problem or the product of systemic processes embedded in data collection, preprocessing, and management. The paper takes a systemic view of the AI lifecycle, from data gathering to model deployment, tracing how errors, omissions, and representational biases propagate at each stage. It draws on secondary data, case studies, and the literature to weigh the relative contributions of data pipelines and model architectures to biased outcomes. The results show that although algorithmic decisions can reinforce inequities, the primary sources of bias are historical data, sampling practices, labeling anomalies, and infrastructure constraints within the data pipeline. The study further highlights that poorly governed data ecosystems tend to amplify existing social imbalances, undermining the ethical soundness of AI systems. The article proposes a bias-mitigation framework that emphasizes data governance, transparency, and continuous auditing alongside responsible model development. It also stresses the importance of interdisciplinary collaboration among data scientists, domain experts, and policymakers in achieving fair AI deployment. By shifting the emphasis from models alone to the data ecosystem, this research offers a more holistic view of equity in AI and supports the development of more socially responsible and trustworthy intelligent systems.
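The continuous-auditing idea in the abstract can be made concrete with a minimal sketch. The code below is a hypothetical illustration, not the paper's method: before training, it compares each group's positive-label rate against the dataset-wide rate to flag sampling or labeling skew upstream of any model. All names (`audit_bias`, the `group`/`label` keys, the `tolerance` threshold) are assumptions for this example.

```python
# Minimal data-pipeline bias audit (illustrative sketch, hypothetical data).
# Flags groups whose positive-label rate deviates from the overall rate,
# a crude demographic-parity style check run before model training.
from collections import Counter

def audit_bias(records, group_key="group", label_key="label", tolerance=0.1):
    """Return {group: positive_rate} for groups whose positive-label rate
    differs from the overall rate by more than `tolerance`."""
    totals = Counter(r[group_key] for r in records)
    positives = Counter(r[group_key] for r in records if r[label_key] == 1)
    overall_rate = sum(positives.values()) / len(records)
    flagged = {}
    for group, n in totals.items():
        rate = positives[group] / n
        if abs(rate - overall_rate) > tolerance:
            flagged[group] = round(rate, 3)
    return flagged

# Toy example: group "b" receives far fewer positive labels than group "a",
# the kind of labeling anomaly a pipeline audit should surface.
data = (
    [{"group": "a", "label": 1}] * 8 + [{"group": "a", "label": 0}] * 2 +
    [{"group": "b", "label": 1}] * 3 + [{"group": "b", "label": 0}] * 7
)
print(audit_bias(data))  # both groups deviate from the 0.55 overall rate
```

A check like this is deliberately model-agnostic: it operates on the dataset itself, which is where the paper locates the dominant sources of bias.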

Published

2026-04-01

How to Cite

Garbage In, Bias Out: Are AI Models or Data Pipelines the Real Problem. (2026). International Journal of Integrated Research and Practice, 2(4), 1-9. https://doi.org/10.65579/31075037.0140