Ghasemi, Lida
2025-04-29
2025-04-29
2025-04-29
2025-04-28
https://hdl.handle.net/10012/21675
The oil and gas industry, a major contributor to climate change, faces increasing scrutiny from regulators, investors, and the public to adopt sustainable practices. Many companies respond with ambitious environmental commitments, yet concerns about greenwashing raise doubts about the authenticity of these claims. Greenwashing allows firms to maintain a sustainable public image while continuing high-emission activities, creating a gap between corporate rhetoric and actual environmental action. This study introduces a data-driven approach to measuring greenwashing by analyzing corporate disclosures from 50 publicly traded oil and gas firms (2011–2020). Using Large Language Models (LLMs) and Natural Language Processing (NLP), we assess sustainability reports, annual filings, and 10-K reports, and develop a quantitative greenwashing index that distinguishes symbolic from substantive environmental claims. The findings reveal systematic discrepancies between voluntary sustainability reports and legally binding financial disclosures. Larger firms and those in highly regulated markets engage in more sophisticated greenwashing, and the practice has increased over time, particularly after the 2015 Paris Agreement. This research highlights the need for stricter regulations, standardized reporting, and independent audits to ensure that corporate sustainability commitments lead to real environmental action.
en
greenwashing; sustainability reporting; oil and gas industry; corporate environmental disclosure; large language models (LLMs); natural language processing (NLP); corporate social responsibility (CSR); text classification; machine learning; environmental accountability; transparency in corporate reporting; AI in sustainability
FROM CLAIMS TO REALITY: A DATA-DRIVEN APPROACH TO MEASURING GREENWASHING WITH LARGE LANGUAGE MODELS
Master Thesis