Publications
# GENWISE: Thematic Discovery from Textual Data
“In this work, we introduce GENWISE - a generative AI-based framework designed to streamline extracting and organizing key information from textual data. Focusing on the prevalent issue in business where significant time is spent on manual data analysis, our framework employs cutting-edge generative AI, embedding, and clustering techniques towards a thematic discovery. We further deliver hierarchical thematic representations, enhancing the ease of understanding for users at different levels. Our methodology includes precise issue extraction through generative AI, utilization of the Retrieval-Augmented Generation framework for improved accuracy, and a 20% improvement in cluster coherency using the Enhanced Community Detection algorithm. This comprehensive pipeline is optimized explicitly for industrial settings, offering a significant leap in efficiency and thematic representation for complex data sets…”
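As a rough illustration of the embedding-and-clustering stage the abstract mentions, the sketch below groups items by linking any two embeddings whose cosine similarity passes a threshold and then taking connected components. This is a simple stand-in and not the paper's Enhanced Community Detection algorithm; the function names, the toy vectors, and the 0.8 threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def threshold_communities(embeddings, threshold=0.8):
    """Group item indices into connected components of the similarity graph.

    Hypothetical helper: two items share an edge when their cosine
    similarity is >= threshold. Components are found with union-find.
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy vectors standing in for embeddings of extracted issues (made up).
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
    [0.0, 0.0, 1.0],
]
clusters = threshold_communities(toy_embeddings, threshold=0.8)
print(clusters)
```

In a real pipeline the vectors would come from an embedding model, and a dedicated community detection method would likely replace the plain connected-components step used here.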
# Comparative Analysis of Transformers for Modeling Tabular Data: A Casestudy using Industry Scale Dataset
“We perform a comparative analysis of transformer-based models designed for modeling tabular data, specifically on an industry-scale dataset. While earlier studies demonstrated promising outcomes on smaller public or synthetic datasets, the effectiveness did not extend to larger industry-scale datasets. The challenges identified include handling high-dimensional data, the necessity for efficient pre-processing of categorical and numerical features, and addressing substantial computational requirements. To overcome the identified challenge…”