Articles with "categorical data" as a keyword



Efficient binary embedding of categorical data using BinSketch

Published in 2022 at "Data Mining and Knowledge Discovery"

DOI: 10.1007/s10618-021-00815-y

Abstract: In this work, we present a dimensionality reduction algorithm, a.k.a. sketching, for categorical datasets. Our proposed sketching algorithm Cabin constructs low-dimensional binary sketches from high-dimensional categorical vectors, and our distance estimation algorithm Cham computes a… read more here.

Keywords: embedding categorical; binary embedding; using binsketch; categorical data ... See more keywords

Effective interpretable learning for large-scale categorical data

Published in 2024 at "Data Mining and Knowledge Discovery"

DOI: 10.1007/s10618-024-01030-1

Abstract: Large-scale categorical datasets are ubiquitous in machine learning, and the success of most deployed machine learning models relies on how effectively the features are engineered. For large-scale datasets, parametric methods are generally used, among… read more here.

Keywords: bin; categorical data; framework; network ... See more keywords

Dealing with categorical data in a multidimensional context: The multidimensional balanced worth.

Published in 2021 at "Social science research"

DOI: 10.1016/j.ssresearch.2021.102561

Abstract: This paper presents an evaluation protocol that permits evaluating the relative performance of a set of populations in a multidimensional context when outcomes are measured in terms of categorical variables. This problem appears in many… read more here.

Keywords: multidimensional context; balanced worth; data multidimensional; dealing categorical ... See more keywords

Go Multivariate: Recommendations on Bayesian Multilevel Hidden Markov Models with Categorical Data.

Published in 2023 at "Multivariate behavioral research"

DOI: 10.1080/00273171.2023.2205392

Abstract: The multilevel hidden Markov model (MHMM) is a promising method to investigate intense longitudinal data obtained within the social and behavioral sciences. The MHMM quantifies information on the latent dynamics of behavior over time. In… read more here.

Keywords: multilevel hidden; categorical data; hidden markov

Time series analysis of categorical data using auto-odds ratio function

Published in 2018 at "Statistics"

DOI: 10.1080/02331888.2017.1421196

Abstract: In this paper, we consider the auto-odds ratio function (AORF) as a measure of serial association for a stationary time series process of categorical data at two different time points. Numerical measures such as… read more here.

Keywords: time series; time; function; categorical data ... See more keywords

Using categorical data analyses in determination of dust-related occupational diseases in mining

Published in 2018 at "International Journal of Occupational Safety and Ergonomics"

DOI: 10.1080/10803548.2018.1531535

Abstract: Dust-related occupational diseases are common in the mining sector. It is important to identify employees who have high potential for these diseases and to investigate the factors affecting disease... read more here.

Keywords: occupational diseases; dust related; data analyses; using categorical ... See more keywords

DicePlot: A package for high dimensional categorical data visualization.

Published in 2024 at "Bioinformatics"

DOI: 10.1093/bioinformatics/btaf337

Abstract: Visualization of multidimensional, categorical data is a common challenge across scientific domains and, in particular, the life sciences. The goal is to create a comprehensive overview of the underlying data which enables one to… read more here.

Keywords: visualization; diceplot package; package high; categorical data ... See more keywords

Automatic Fuzzy Clustering Using Non-Dominated Sorting Particle Swarm Optimization Algorithm for Categorical Data

Published in 2019 at "IEEE Access"

DOI: 10.1109/access.2019.2927593

Abstract: Categorical data clustering has attracted a lot of attention recently due to its necessity in real-world applications. Many clustering methods have been proposed for categorical data. However, most of the existing algorithms require… read more here.

Keywords: fuzzy clustering; automatic fuzzy; categorical data; number clusters ... See more keywords

Graph Enhanced Fuzzy Clustering for Categorical Data Using a Bayesian Dissimilarity Measure

Published in 2023 at "IEEE Transactions on Fuzzy Systems"

DOI: 10.1109/tfuzz.2022.3189831

Abstract: Categorical data are widely available in many real-world applications, and to discover valuable patterns in such data by clustering is of great importance. However, the lack of a decent quantitative relationship among categorical values makes… read more here.

Keywords: fuzzy clustering; bayesian dissimilarity; categorical data; dissimilarity measure ... See more keywords

From Whole to Part: Reference-Based Representation for Clustering Categorical Data

Published in 2020 at "IEEE Transactions on Neural Networks and Learning Systems"

DOI: 10.1109/tnnls.2019.2911118

Abstract: Dissimilarity measures play a crucial role in clustering and are directly related to the performance of clustering algorithms. However, effectively measuring the dissimilarity is not easy, especially for categorical data. The main difficulty of the… read more here.

Keywords: representation; based representation; reference; categorical data ... See more keywords

Solving missing categorical data in questionnaire responses for automated classification

Published in 2025 at "Bulletin of Electrical Engineering and Informatics"

DOI: 10.11591/eei.v14i4.8785

Abstract: Handling missing categorical data is critical for maintaining the accuracy and reliability of automatic classification tasks, particularly in mental health screening based on questionnaire responses. This study investigates several imputation methods, including last observation carried… read more here.

Keywords: classification; categorical data; questionnaire responses; imputation ... See more keywords