Cloud Computing For Machine Learning And Cognitive Applications (MIT Press)

Deep learning (Hinton and Salakhutdinov 2006; Hinton et al. 2006; LeCun et al. 2015), an advanced statistical machine learning model, together with access to large amounts of data and faster computers, has enabled great advances in machine learning and perception in the last 10 years. Based on the idea of DGCC, the intelligence learning mechanism of deep learning is studied in this paper and taken as a new artificial intelligence mechanism called hierarchical structuralism. It also resembles the idea of hierarchical problem solving in granular computing (Yao 2011).

After examining the principles of different intelligent cognitive computing models, such as deep learning, logic neural networks, fuzzy sets, rough sets, quotient space theory, the cloud model (a qualitative and quantitative mapping model), formal concept analysis, three-way decisions, and clustering, a hierarchical structuralism that combines symbolism and connectionism is proposed for artificial intelligence based on DGCC. The HD\(^3\) characteristics of the hierarchical structuralism are discussed.

Abstract: Mobile edge computing (MEC) within 5G networks brings the power of cloud computing, storage, and analysis closer to the end user. The increased speeds and reduced delay enable novel applications such as connected vehicles, large-scale IoT, video streaming, and industry robotics. Machine Learning (ML) is leveraged within mobile edge computing to predict changes in demand based on cultural events, natural disasters, or daily commute patterns, and it prepares the network by automatically scaling up network resources as needed. Together, mobile edge computing and ML enable seamless automation of network management to reduce operational costs and enhance user experience. In this paper, we discuss the state of the art for ML within mobile edge computing and the advances needed in automating adaptive resource allocation, mobility modeling, security, and energy efficiency for 5G networks.

Keywords: 5G; edge network; deep learning; reinforcement learning; caching; task offloading; mobile computing; edge computing; mobile edge computing; cloud computing; network function virtualization; slicing; 5G network standardization

In recent years, several popular distributed machine learning algorithms have been proposed, including decision rules [44], stacked generalization [45], meta-learning [46], and distributed boosting [47]. By exploiting distributed computing to manage large volumes of data, distributed learning avoids the need to gather data into a single workstation for central processing, saving time and energy. More widespread applications of distributed learning are expected [42]. Similar to distributed learning, another popular technique for scaling up traditional learning algorithms is parallel machine learning [48]. With the power of multicore processors and cloud computing platforms, parallel and distributed computing systems have recently become widely accessible [42]. A more detailed description of distributed and parallel learning can be found in [49].

The success of kernel-based learning is rooted in its combination of high expressive power with the possibility of performing numerous analyses, and it has been applied in many challenging applications [65], e.g., online classification [66], convexly constrained parameter/function estimation [67], beamforming problems [68], and adaptive multiregression [69]. One of the most popular surveys of kernel-based learning algorithms is [70], which gives an introduction to the field of kernel-based learning methods and applications.
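
As an illustration of online classification with kernels, the following is a minimal sketch of a kernel perceptron with a Gaussian (RBF) kernel. It is not the method of [66]; the class name `KernelPerceptron`, the `gamma` parameter, and the XOR-style toy stream are assumptions chosen for illustration. The example shows how a kernel lets a linear update rule separate data that is not linearly separable in the input space.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


class KernelPerceptron:
    """Hypothetical minimal online kernel classifier (labels are +1 / -1)."""

    def __init__(self, kernel=rbf_kernel):
        self.kernel = kernel
        self.support = []  # (point, label) pairs on which a mistake was made

    def score(self, x):
        # Decision value: signed sum of kernel similarities to stored mistakes.
        return sum(y * self.kernel(p, x) for p, y in self.support)

    def predict(self, x):
        return 1 if self.score(x) > 0 else -1

    def partial_fit(self, x, y):
        """Process one example; return True if the model was updated."""
        if y * self.score(x) <= 0:
            self.support.append((x, y))
            return True
        return False


# XOR-style stream: not linearly separable in the original 2-D space.
stream = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]

model = KernelPerceptron()
for _ in range(5):  # a few passes over the stream
    mistakes = sum(model.partial_fit(x, y) for x, y in stream)
    if mistakes == 0:
        break

predictions = [model.predict(x) for x, _ in stream]
```

The model stores only the examples it misclassified, so each update is cheap, but the stored set can grow with the stream; practical online kernel methods usually bound it in some way.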

There is no doubt that we are now swimming in an expanding sea of data that is too voluminous to train a machine learning algorithm with a central processor and storage. Instead, distributed frameworks with parallel computing are preferred. The alternating direction method of multipliers (ADMM) [72, 73], which serves as a promising computing framework for developing distributed, scalable, online convex optimization algorithms, is well suited to parallel and distributed large-scale data processing. The key merit of ADMM is its ability to split or decouple multiple variables in optimization problems, which enables one to find a solution to a large-scale global optimization problem by coordinating solutions to smaller sub-problems. In general, ADMM converges for convex optimization, but it lacks convergence and theoretical performance guarantees for nonconvex optimization. However, vast experimental evidence in the literature supports the empirical convergence and good performance of ADMM [74]. A wide variety of applications of ADMM to machine learning problems for large-scale datasets are discussed in [74].
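
The idea of coordinating smaller sub-problems can be sketched with consensus ADMM on a deliberately simple problem: each worker holds one chunk of scalar data and minimizes the local objective (1/2) Σ (x − a)² over its own chunk, while the consensus variable `z` is driven toward the global least-squares solution, which is the mean of all the data. The function name, the penalty parameter `rho`, and the toy data are assumptions made for illustration; this is not an implementation taken from [72–74].

```python
def consensus_admm_mean(chunks, rho=1.0, iterations=100):
    """Consensus ADMM (scaled form) for min_x sum_i 0.5 * sum_{a in chunk_i} (x - a)^2.

    Each worker i keeps a local copy x[i] and a scaled dual variable u[i];
    z is the shared consensus variable.
    """
    n_workers = len(chunks)
    z = 0.0
    x = [0.0] * n_workers
    u = [0.0] * n_workers

    for _ in range(iterations):
        # Local step: each worker solves its own small sub-problem in closed form.
        # Setting the derivative of f_i(x) + (rho/2) * (x - z + u_i)^2 to zero gives:
        for i, chunk in enumerate(chunks):
            x[i] = (sum(chunk) + rho * (z - u[i])) / (len(chunk) + rho)

        # Coordination step: average the local estimates (plus duals).
        z = sum(xi + ui for xi, ui in zip(x, u)) / n_workers

        # Dual step: accumulate each worker's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]

    return z, x


chunks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
z, local_x = consensus_admm_mean(chunks)
```

Only `x[i] + u[i]` has to be communicated in each round, so the raw data never leaves the worker that holds it. In a real deployment the local step would run in parallel on separate machines; here it is a plain loop to keep the sketch self-contained.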

In addition to distributed theoretical frameworks for machine learning that mitigate the challenges of high data volumes, some practical parallel programming methods have also been proposed and applied to learning algorithms to deal with large-scale data sets. MapReduce [75, 76], a powerful programming framework, enables the automatic parallelization and distribution of computation on large clusters of commodity machines. Moreover, MapReduce provides strong fault tolerance, which is important when handling large data sets. The core idea of MapReduce is to first divide massive data into small chunks and then process these chunks in parallel and in a distributed manner to generate intermediate results. The final result is derived by aggregating all the intermediate results. A general means of programming machine learning algorithms on multicore machines with the advantage of MapReduce has been investigated in [77]. Cloud-computing-assisted learning is another impressive advance that helps data systems deal with the volume challenge of big data. Cloud computing [78, 79] has already demonstrated admirable elasticity, which offers hope of realizing the scalability that machine learning algorithms need. It can enhance computing and storage capacity through cloud infrastructure. In this context, distributed GraphLab, a framework for machine learning in the cloud, has been proposed in [80].
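
The divide, process, and aggregate pattern described above can be sketched in a few lines of Python. This is only a single-machine illustration of the programming model, not the MapReduce system of [75, 76]: a thread pool stands in for the cluster, there is no fault tolerance, and the helper names (`split`, `map_chunk`, `combine`, `mean_and_variance`) are hypothetical. The map step emits per-chunk sufficient statistics, and the reduce step adds them up to obtain the mean and variance of the full data set.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split(data, chunk_size):
    """Divide the data into small chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def map_chunk(chunk):
    """Map step: intermediate result (count, sum, sum of squares) for one chunk."""
    return (len(chunk), sum(chunk), sum(v * v for v in chunk))


def combine(a, b):
    """Reduce step: merge two intermediate results element-wise."""
    return tuple(p + q for p, q in zip(a, b))


def mean_and_variance(data, chunk_size=3):
    chunks = split(data, chunk_size)
    # Chunks are processed independently; a thread pool stands in for a cluster.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(map_chunk, chunks))
    n, total, total_sq = reduce(combine, partials, (0, 0.0, 0.0))
    mean = total / n
    return mean, total_sq / n - mean * mean  # population variance


mean, variance = mean_and_variance([float(v) for v in range(10)])
```

Because `combine` is associative, the intermediate results can be merged in any order, which is what allows the chunks to be processed independently.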

In fact, for big data processing, most machine learning techniques are not universal; that is, we often need to choose specific learning methods according to the data at hand. For example, for high-dimensional datasets, representation learning seems to be a promising solution: it can learn meaningful representations of the data that make it easier to extract useful information, and it achieves impressive performance on many dimensionality reduction tasks. For large volumes of data, by contrast, distributed and parallel learning methods have stronger advantages. If the data to be processed are drawn from different feature spaces and have different distributions, transfer learning is a good choice, since it can intelligently apply previously learned knowledge to solve new problems faster. In the context of big data, we frequently face the situation in which data are abundant but labels are scarce or expensive to obtain. To tackle this issue, active learning aims to achieve high accuracy using as few labeled instances as possible. Nonlinear data processing is another thorny problem, for which kernel-based learning offers powerful computational capability. Finally, if we want to process data in a timely or (nearly) real-time manner, online learning and the extreme learning machine are more helpful.
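
To make the active learning case concrete, the following minimal sketch learns a one-dimensional threshold classifier by always querying the unlabeled point about which the current hypothesis is least certain, namely the midpoint between the largest known negative and the smallest known positive. The pool, the hidden threshold of 62, and the function names are toy assumptions for illustration, and the sketch assumes noise-free labels that switch from 0 to 1 exactly once along the sorted pool.

```python
def active_learn_threshold(pool, oracle):
    """Find the index of the first positive point in a sorted pool.

    pool   -- sorted list of unlabeled points
    oracle -- function returning the true label (0 or 1) of a point;
              each call counts as one (expensive) label request
    """
    lo, hi = -1, len(pool)  # indices <= lo are known 0, indices >= hi are known 1
    queries = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2  # the most uncertain remaining point
        queries += 1
        if oracle(pool[mid]) == 1:
            hi = mid
        else:
            lo = mid
    return hi, queries


pool = list(range(101))   # 101 unlabeled points
TRUE_THRESHOLD = 62       # hidden from the learner


def oracle(x):
    return 1 if x >= TRUE_THRESHOLD else 0


boundary, queries = active_learn_threshold(pool, oracle)
predicted = [1 if i >= boundary else 0 for i in range(len(pool))]
```

In this run the learner labels the whole pool correctly after 7 label requests instead of 101, because each query halves the region of uncertainty. With noisy labels or richer model classes, the query rule and the guarantees are considerably more involved.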

There is no doubt that signal processing (SP) is of utmost relevance to timely big data applications such as real-time medical imaging, sentiment analysis from online social media, smart cities, and so on [110]. The interest in big-data-related research from the SP community is evident from the increasing number of papers submitted on this topic to SP-oriented journals, workshops, and conferences. In this section, we mainly discuss the close connections between machine learning and SP techniques for big data processing. Specifically, in Section 1.4.1, we analyze the existing studies on SP for big data from four different perspectives and present several representative works. In Section 1.4.2, we review the latest research progress based on these typical works.

where f and g are convex functions. To obtain an optimal solution \(x^*\) of (6) given the required assumptions on f and g, the authors of this article presented three efficient big data approximation techniques: first-order methods, randomization, and parallel and distributed computation. They mainly referred to scalable, randomized, and parallel algorithms for big data analytics. In addition, for the optimization problem in (6), ADMM can provide a simple distributed algorithm to solve its composite form by leveraging powerful augmented Lagrangian and dual decomposition techniques. ADMM has two caveats: closed-form solutions do not always exist, and there are no convergence guarantees for more than two optimization objective terms. Several recent solutions address these two drawbacks, such as proximal gradient methods and parallel computing [111]. From the machine learning perspective, scalable, parallel, and distributed mechanisms are also needed, and applications of recent convex optimization algorithms to learning methods such as support vector machines and graph learning have appeared in recent years.
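
For reference, the scaled-form ADMM iteration for a two-term composite problem can be written as below. This is a sketch under an assumption: it takes the composite form to be the minimization of f(x) + g(z) subject to x = z, which may differ in detail from (6). The copy variable z, the scaled dual variable u, and the penalty parameter ρ > 0 do not appear in the text above and are introduced here only for illustration.

```latex
\begin{aligned}
x^{k+1} &= \arg\min_{x}\; f(x) + \frac{\rho}{2}\,\bigl\| x - z^{k} + u^{k} \bigr\|_2^2, \\
z^{k+1} &= \arg\min_{z}\; g(z) + \frac{\rho}{2}\,\bigl\| x^{k+1} - z + u^{k} \bigr\|_2^2, \\
u^{k+1} &= u^{k} + x^{k+1} - z^{k+1}.
\end{aligned}
```

The first two updates involve f and g separately, which is the decoupling that makes a distributed implementation possible; whether each update has a closed-form solution depends on f and g, which is the first caveat mentioned above.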

The latest progress based on [111]: A broad class of machine learning and SP problems can be formally stated as optimization problems. Based on the idea of convex optimization for big data analytics in [111], a randomized primal-dual algorithm was proposed in [115] for composite optimization, which could be used in the framework of large-scale machine learning applications. In addition, a consensus-based decentralized algorithm for a class of nonconvex optimization problems was investigated in [116], with an application to dictionary learning.