Articles with "apache spark" as a keyword



Halvade somatic: Somatic variant calling with Apache Spark

Published in 2022 at "GigaScience"

DOI: 10.1093/gigascience/giab094

Abstract: Background: The accurate detection of somatic variants from sequencing data is of key importance for cancer treatment and research. Somatic variant calling requires a high sequencing depth of the tumor sample, especially when the…

Keywords: halvade somatic; somatic variant; sequencing data; apache spark ...

Unsupervised Graph Anomaly Detection Algorithms Implemented in Apache Spark

Published in 2018 at "Lobachevskii Journal of Mathematics"

DOI: 10.1134/s1995080218090184

Abstract: The graph anomaly detection problem occurs in many application areas and can be solved by spotting outliers in unstructured collections of multi-dimensional data points, which can be obtained by graph analysis algorithms. We implement the…

Keywords: algorithms; apache spark; anomaly detection; graph ...

DECA: scalable XHMM exome copy-number variant calling with ADAM and Apache Spark

Published in 2019 at "BMC Bioinformatics"

DOI: 10.1186/s12859-019-3108-7

Abstract: Background: XHMM is a widely used tool for copy-number variant (CNV) discovery from whole exome sequencing data but can require hours to days to run for large cohorts. A more scalable implementation would reduce the need…

Keywords: copy number; spark; apache spark; number variant ...

Concept and benchmark results for Big Data energy forecasting based on Apache Spark

Published in 2018 at "Journal of Big Data"

DOI: 10.1186/s40537-018-0119-6

Abstract: The present article describes a concept for the creation and application of energy forecasting models in a distributed environment. Additionally, a benchmark comparing the time required for the training and application of data-driven forecasting models…

Keywords: spark; apache spark; energy forecasting; big data ...

Big Data in metagenomics: Apache Spark vs MPI

Published in 2020 at "PLOS ONE"

DOI: 10.1371/journal.pone.0239741

Abstract: The progress of next-generation sequencing has led to the availability of massive data sets used by a wide range of applications in biology and medicine. This has sparked significant interest in using modern Big Data…

Keywords: big data; version; mpi; spark ...

A distributed computing model for big data anonymization in the networks

Published in 2023 at "PLOS ONE"

DOI: 10.1371/journal.pone.0285212

Abstract: Recently, big data and its applications have seen sharp growth in various fields such as IoT, bioinformatics, eCommerce, and social media. The huge volume of data has created enormous challenges for the architecture, infrastructure, and computing capacity…

Keywords: big data; apache spark; data anonymization; model ...

A Cloud-Based Framework for Large-Scale Log Mining through Apache Spark and Elasticsearch

Published in 2019 at "Applied Sciences"

DOI: 10.3390/app9061114

Abstract: The volume, variety, and velocity of different data, e.g., simulation data, observation data, and social media data, are growing ever faster, posing grand challenges for data discovery. An increasing trend in data discovery is to…

Keywords: cloud based; mining; apache spark; log mining ...