U.S. flag

An official website of the United States government, Department of Justice.

NCJRS Virtual Library

The Virtual Library houses over 235,000 criminal justice resources, including all known OJP works.
Click here to search the NCJRS Virtual Library

Large-scale Cost-aware Classification Using Feature Computational Dependency Graph

NCJ Number
302699
Journal
IEEE Transactions on Knowledge and Data Engineering Dated: 2019 Pages: 1-1
Author(s)
Q. Li; et al
Date Published
2019
Length
1 page
Annotation

To improve the high memory and computational costs in the CAFH model, this article proposes an equivalent Accelerated CAFH (ACAFH) model based on the lossless heterogeneous hypergraph decomposition.

Abstract

With the rapid growth of real-time machine learning applications, the process of feature selection and model optimization requires implementation to integrate with the constraints on computational budgets. A specific computational resource in this regard is the time needed to evaluate predictions on test instances. The joint optimization problem of prediction accuracy and prediction-time efficiency draws more and more attention in the data mining and machine learning communities. The runtime cost is dominated by the feature generation process that contains significantly redundant computations across different features that share the same computational component in practice. Eliminating such redundancies would obviously reduce the time costs in the feature generation process. The authors previous cost-aware classification using feature computational dependencies heterogeneous Hypergraph (CAFH) model has achieved excellent performance on the effectiveness. In the big data era, the high dimensionality caused by the heterogeneous data sources leads to the difficulty in fitting the entire hypergraph into the main memory and the high computational cost during the optimization process. Simply partitioning the features into batches cannot give the optimal solution since it will lose some feature dependencies across the batches. To improve the high memory and computational costs in the CAFH model, we propose an equivalent Accelerated CAFH (ACAFH) model based on the lossless heterogeneous hypergraph decomposition. An efficient and effective nonconvex optimization algorithm based on the alternating direction method of multipliers (ADMM) is developed to optimize the ACAFH model. The time and space complexities of the optimization algorithm for the ACAFH model are three and one polynomial degrees less than the authors’ previous algorithm for the CAFH model, respectively. Extensive experiments demonstrate the proposed ACAFH model achieves competitive performance on the effectiveness and much better performance on the efficiency. (publisher abstract modified)