SCIE, Ei, INSPEC, JST, AJ, MR, CA, DBLP, etc.
Edited by: Editorial Board of Journal Of Computer Science and Technology
Guo-Jie Li, Editor-in-Chief
P.O. Box 2704, Beijing 100190, P.R. China
Sponsored by: Institute of Computing Technology, CAS & China Computer Federation
Undertaken by: Institute of Computing Technology, CAS
Published by: SCIENCE PRESS, BEIJING, CHINA
Distributed by: China: All Local Post Offices; Other Countries: Springer
In this article, we survey the main achievements of the past decade in managing moving objects with transportation modes. As an important kind of human behavior, transportation modes reflect characteristic movement features and enrich mobility data with informative knowledge. We make explicit comparisons with closely related work that investigates moving objects by incorporating location-dependent semantics and descriptive attributes. An exhaustive survey is offered covering the following aspects: 1) modeling and representing mobility data with motion modes; 2) answering spatio-temporal queries with transportation modes; 3) query optimization techniques; 4) predicting transportation modes from sensor data, e.g., GPS-enabled devices. Several new and emergent issues concerning transportation modes are proposed for future research.
The rapid development of social networks has resulted in a proliferation of user-generated content (UGC), which can benefit many applications. In this paper, we study the problem of identifying a user's locations from microblogs, to facilitate effective location-based advertisement and recommendation. Since the location information in a single microblog is incomplete, we cannot infer an accurate location from that microblog alone. As such, we propose a global location identification method, Glitter. Glitter combines multiple microblogs of a user and utilizes them to identify the user's locations. Glitter not only improves the quality of identifying a user's location but also supplements the location of a microblog so as to obtain an accurate location for each microblog. To facilitate location identification, Glitter organizes points of interest (POIs) into a tree structure where leaf nodes are POIs and non-leaf nodes are segments of POIs, e.g., countries, cities, and streets. Using the tree structure, Glitter first extracts candidate locations from each microblog of a user, which correspond to some tree nodes. Then Glitter aggregates these candidate locations and identifies the top-k locations of the user. Using the identified top-k user locations, Glitter refines the candidate locations and computes the top-k locations of each microblog. To achieve high recall, we enable fuzzy matching between locations and microblogs. We propose an incremental algorithm to support dynamic updates of microblogs. We also study how to identify users' trajectories based on the extracted locations, and propose an effective algorithm to extract high-quality trajectories. Experimental results on real-world datasets show that our method achieves high quality and good performance, and scales well.
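The aggregation step described above can be sketched as follows. This is a minimal illustration under assumed names and a simple voting scheme, not Glitter's actual implementation:

```python
from collections import Counter

def top_k_user_locations(microblog_candidates, k=3):
    """Combine candidate locations extracted from each of a user's
    microblogs and return the k locations with the most support."""
    votes = Counter()
    for candidates in microblog_candidates:
        for loc in set(candidates):  # count each location once per microblog
            votes[loc] += 1
    return [loc for loc, _ in votes.most_common(k)]

# "Beijing" is supported by all three microblogs, so it ranks first
microblogs = [["Beijing", "Haidian"], ["Beijing"], ["Shanghai", "Beijing"]]
print(top_k_user_locations(microblogs, k=2))
```

In the full method, the candidates would be tree nodes from the POI hierarchy rather than plain strings, and the aggregated user locations would then feed back into refining each microblog's own location.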
As a fundamental operation in LBS (location-based services), computing the trajectory similarity of moving objects has been extensively studied in recent years. However, due to the increasing volume of moving object trajectories and the demand for interactive query performance, trajectory similarity queries are now required to be processed on massive datasets in a real-time manner. Existing work has proposed distributed or parallel solutions to enable large-scale trajectory similarity processing. However, those techniques cannot be directly adapted to the real-time scenario, as they are likely to exhibit poor load balancing when workload variance occurs on the incoming trajectory stream. In this paper, we propose a new workload partitioning framework, ART (Adaptive Framework for Real-Time Trajectory Similarity), which introduces practical algorithms to support dynamic workload assignment for RTTS (real-time trajectory similarity). Our proposal includes a processing model tailored for the RTTS scenario, a load balancing framework to maximize throughput, and an adaptive data partitioning scheme designed to cut unnecessary network cost. Based on this, our model can handle large-scale trajectory similarity in an online scenario, achieving scalability, effectiveness, and efficiency in a single shot. Empirical studies on synthetic data and real-world stream applications validate the usefulness of our proposal and demonstrate its significant advantage over state-of-the-art solutions in the literature.
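The core of dynamic workload assignment can be conveyed with a least-loaded-worker heuristic. This is a toy sketch with assumed names, not ART's actual partitioner, which additionally accounts for data locality and network cost:

```python
import heapq

def balance(chunks, num_workers):
    """Assign incoming trajectory chunks (id, cost) to the currently
    least-loaded worker, so workload variance on the stream does not
    skew any single node."""
    heap = [(0, w) for w in range(num_workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment = {w: [] for w in range(num_workers)}
    for chunk_id, cost in chunks:
        load, w = heapq.heappop(heap)     # cheapest worker right now
        assignment[w].append(chunk_id)
        heapq.heappush(heap, (load + cost, w))
    return assignment
```

A real-time variant would process the chunks as an unbounded stream and periodically repartition when the observed load skew exceeds a threshold.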
Resource description framework (RDF) streams are useful for modeling spatio-temporal data. In this paper, we propose a framework for large-scale RDF stream processing, LRSP, to process general continuous queries over large-scale RDF streams. Firstly, we propose a formalization (named CT-SPARQL) to represent general continuous queries in a unified, unambiguous way. Secondly, based on this formalization, we propose LRSP to process continuous queries in a common white-box way by separating RDF stream processing, query parsing, and query execution. Finally, we implement LRSP and evaluate it against popular continuous query engines on benchmark datasets and real-world datasets. Owing to the architecture of LRSP, many efficient RDF query engines (both centralized and distributed) can be directly employed to process continuous queries. The experimental results show that LRSP achieves higher performance, especially in processing large-scale real-world data.
With the widespread use of smartphones and the mobile Internet, social network users have generated massive geo-tagged tweets, photos, and videos, forming informative trajectories which reveal not only their spatio-temporal dynamics but also their activities in the physical world. Existing spatial trajectory query studies mainly focus on analyzing the spatio-temporal properties of users' trajectories, while leaving the understanding of their activities largely untouched. In this paper, we incorporate the semantics of the activity information embedded in trajectories into query modelling and processing, with the aim of providing end users with more informative and meaningful results. To this end, we propose a novel trajectory query that not only considers spatio-temporal closeness but also, more importantly, leverages a proven technique from the text mining field, probabilistic topic modelling, to capture the semantic relatedness of the activities between the data and the query. To support efficient query processing, we design a hierarchical grid-based index by integrating the probabilistic topic distribution on the substructures of trajectories and their spatio-temporal extent at the corresponding level of the index hierarchy. This specialized structure enables a top-down search algorithm to traverse the index while pruning unqualified trajectories in the spatial and topical dimensions simultaneously. The experimental results on real-world datasets demonstrate the efficiency and scalability of the proposed indices and trajectory search methods.
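The simultaneous spatial and topical filtering can be sketched as below. The flat candidate scan and cosine measure are illustrative assumptions; the paper instead prunes hierarchically inside the grid index using bounds at each level:

```python
import math

def topic_similarity(p, q):
    """Cosine similarity between two probabilistic topic distributions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0

def prune(trajectories, query_box, query_topics, theta):
    """Keep a trajectory only if it intersects the query window AND its
    topic distribution is close enough to the query's (threshold theta)."""
    xlo, ylo, xhi, yhi = query_box
    result = []
    for traj in trajectories:
        in_box = any(xlo <= x <= xhi and ylo <= y <= yhi
                     for x, y in traj["points"])
        if in_box and topic_similarity(traj["topics"], query_topics) >= theta:
            result.append(traj["id"])
    return result
```

In the indexed version, both checks would be evaluated on aggregated node-level summaries first, so entire subtrees of trajectories can be discarded before any point-level work.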
We extend the constraint data model to allow complex objects and study the expressive power of various query languages over this sort of constraint databases. The tools we use come in the form of collapse results, which are well established in the context of first-order logic. We show that the natural-active collapse with a condition and the active-generic collapse carry over to second-order logic for structures with the o-minimality property and any signature in the complex value relations. Expressiveness results for more powerful logics, including monadic second-order logic, monadic second-order logic with fixpoint operators, and fragments of second-order logic, are investigated in the paper. We discuss the data complexity for second-order logics over constraint databases. The main results are that the complexity upper bounds for the three theories MSO + Lin, MSO + Poly, and Inflationary Datalog^act_cv,¬(SC, M) without the powerset operator are ∪_i Σ_i NC^1, NCH = ∪_i Σ_i NC, and AC^0/poly, respectively. We also consider the problem of the query closure property in the context of embedded finite models and constraint databases with complex objects, and the issue of how to determine safe constraint queries.
Trip recommendation has become increasingly popular with the rapid growth of check-in data in location-based social networks. Most existing studies focus only on the popularity of trips. In this paper, we further consider the usability of trip recommendation results through spatial diversification. We formulate a new type of query named the spatially diversified top-k routes (SDkR) query. This type of query finds k trip routes with the highest popularity, each of which starts at a given starting point, consumes travel time within a given time budget, and passes through points of interest (POIs) of given categories. Any two trip routes returned are diversified to a certain degree defined by the spatial distance between the two routes. We show that the SDkR problem is NP-hard. We propose two exact algorithms to solve the problem. The first algorithm starts by identifying all candidate routes that satisfy the query constraints, and then searches for the k-route combination with the highest popularity. The second algorithm identifies the candidate routes and builds up the optimal k-route combination progressively at the same time. Further, we propose an approximate algorithm to achieve even higher query efficiency with precision bounds. We demonstrate the effectiveness and efficiency of the proposed algorithms on real datasets. Our experimental results show that our algorithms find popular routes with diversified POI locations, and our approximate algorithm saves up to 90% of the query time compared with the baseline algorithms.
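The flavor of spatial diversification can be conveyed by a greedy selection over candidate routes. This is only a hedged sketch with assumed field names, not the paper's exact algorithms or its bounded approximation:

```python
def diversified_top_k(routes, k, min_dist, distance):
    """Pick routes in descending popularity order, skipping any route
    closer than min_dist to one already chosen, until k are selected.
    `distance` is a caller-supplied route-to-route spatial distance."""
    chosen = []
    for route in sorted(routes, key=lambda r: -r["popularity"]):
        if all(distance(route, c) >= min_dist for c in chosen):
            chosen.append(route)
            if len(chosen) == k:
                break
    return [r["id"] for r in chosen]
```

Because the underlying k-route combination problem is NP-hard, a greedy pass like this trades optimality of total popularity for a single linear scan over the popularity-sorted candidates.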
Coordinated partitioning and resource sharing have attracted considerable research interest in the field of real-time multiprocessor systems. However, finding an optimal partition is widely known to be NP-hard, even for independent tasks. A recently proposed resource-oriented partitioned (ROP) fixed-priority scheduling approach, which partitions tasks and shared resources separately, has been shown to achieve a non-trivial speedup factor guarantee, promoting the research of coordinated scheduling to a new level. Despite its theoretical elegance, the schedulability performance of ROP scheduling is restricted by the heuristic partitioning methods used in the original study. In this paper, we address the partitioning problem for tasks and shared resources under ROP scheduling. A unified schedulability analysis framework for ROP scheduling is proposed first. A sophisticated partitioning approach based on integer linear programming (ILP) is then developed on top of the unified analysis. Empirical results show that the proposed methods improve the schedulability of ROP scheduling significantly, and that the runtime complexity of searching for a solution is markedly reduced compared with other ILP-based approaches.
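The combinatorial structure of the partitioning decision can be illustrated with a toy exhaustive search. The paper formulates this as an ILP over its unified schedulability analysis; the brute force below only shows the constraint shape, with a simplified per-processor utilization bound standing in for the full schedulability test:

```python
from itertools import product

def partition(utilizations, num_procs, bound=1.0):
    """Return the first task-to-processor assignment (as a list of
    processor indices) under which every processor's total utilization
    stays within `bound`, or None if no such assignment exists."""
    for assign in product(range(num_procs), repeat=len(utilizations)):
        loads = [0.0] * num_procs
        for util, proc in zip(utilizations, assign):
            loads[proc] += util
        if all(load <= bound for load in loads):
            return list(assign)
    return None
```

An ILP formulation replaces this exponential enumeration with binary assignment variables and linear capacity constraints, which is what makes the approach scale to realistic task sets.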
One way to increase storage density is to use shingled magnetic recording (SMR) disks. We propose a novel use of SMR disks in RAID (redundant array of independent disks) arrays, specifically building upon and compared with a basic RAID 4 arrangement. The proposed scheme (called RAID 4SMR) has the potential to improve the performance of a traditional RAID 4 array with SMR disks. Our evaluation shows that, compared with standard RAID 4 using update-in-place, RAID 4SMR with garbage collection not only allows the adoption of SMR disks with a reduced performance penalty, but also offers a performance improvement of up to 56%.
Despite extensive efforts, vehicular ad-hoc networks (VANETs) still face many challenges, such as network performance, network scalability, and context-awareness. Many solutions have been proposed to overcome these obstacles, and edge computing, an extension of cloud computing, is one of them. With edge computing, communication, storage, and computational capabilities are brought closer to end users. This can offer many benefits to the global vehicular network, including, for example, lower latency, network off-loading, and context-awareness (location, environmental factors, etc.). Different approaches to edge computing have been developed: mobile edge computing (MEC), fog computing (FC), and cloudlets are the main ones. After introducing the vehicular environment background, this paper studies and compares these different technologies. For that purpose, their main features are compared and the state-of-the-art applications in VANETs are analyzed. In addition, MEC, FC, and cloudlets are classified, and their suitability for different types of vehicular applications is discussed. Finally, some challenges and future research directions in the fields of edge computing and VANETs are discussed.
This study focuses on the problem of handoff minimization for a set of users moving in a wireless network. The problem is analyzed by considering two cases for the users' movement under access point capacity constraints: 1) all users move together, and 2) each user can choose his/her own path within the network. In the first case, we propose an algorithm with an optimal competitive ratio for the problem. In the second case, under the connectivity assumption, that is, "as long as a user is connected to an access point whose received signal strength is not less than a specified threshold, the user should maintain his/her connection", we prove that no approach can reduce the number of unnecessary handoffs, even in an offline setting. Without the connectivity assumption, however, we present an optimal deterministic algorithm with a competitive ratio of nΔ for this problem in the online setting, where n is the number of users and Δ is the maximum number of access points covering any single point in the environment. We also prove that the randomized version of the algorithm achieves an expected competitive ratio of O(log Δ).
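The lazy handoff behavior implied by the connectivity assumption can be sketched with a small simulation for a single user. This is an illustrative sketch with assumed names, not the paper's nΔ-competitive algorithm:

```python
def simulate(signal_trace, threshold):
    """Count handoffs for one user who keeps the current access point
    while its signal stays at or above `threshold`, and only then
    switches to the strongest available access point.
    signal_trace: list of {AP id: signal strength} dicts, one per step."""
    current, handoffs = None, 0
    for signals in signal_trace:
        if current is None or signals.get(current, 0) < threshold:
            best = max(signals, key=signals.get)  # strongest AP now
            if best != current:
                if current is not None:
                    handoffs += 1                 # an actual handoff
                current = best
    return handoffs
```

The offline lower bound in the abstract says that under this assumption no algorithm, even one that sees the whole trace in advance, can avoid the handoffs such a rule is forced into.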
Inter-process communication (IPC) provides a message passing mechanism for information exchange between applications. It has long been believed that IPC can be abused by malware writers to launch collusive information leaks using two or more applications. Much work on privacy protection focuses on simple information leaks caused by individual applications, and lacks effective approaches to preventing collusive information leaks caused by IPC between multiple processes. In this paper, we propose a hybrid approach to preventing collusive information leaks based on information flow control. Our approach combines static information flow analysis with dynamic runtime checking. Information leaks caused by individual processes are prevented through static information flow control, while dynamic checking is performed at runtime to prevent collusive information leaks. Such a combination can effectively reduce the runtime overhead of pure dynamic checking, and reduce false alarms in pure static analysis. We develop this approach based on an abstract and simplified programming model, and formalize a novel definition of the leak-freedom property as our target security property. A simulation-based proof technique is used to prove that our approach guarantees leak-freedom. All proofs are mechanized in Coq.
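The dynamic half of such a hybrid scheme can be illustrated with a toy taint-propagating runtime monitor. The class and method names are hypothetical; the actual approach works on a formal programming model, not this API:

```python
class Monitor:
    """Toy runtime checker: a process that has read secret input is
    marked tainted, taint propagates through IPC, and any write from a
    tainted process to a public sink is rejected. This blocks the
    collusive flow secret -> A -> B -> public that neither A nor B
    exhibits on its own."""

    def __init__(self):
        self.tainted = set()

    def read_secret(self, pid):
        self.tainted.add(pid)

    def ipc(self, src, dst):
        if src in self.tainted:
            self.tainted.add(dst)   # taint propagates through the message
        return True                 # the IPC itself is allowed

    def write_public_sink(self, pid):
        return pid not in self.tainted  # blocked if possibly leaking
```

In the hybrid design, the static analysis would discharge the intra-process flows up front, so the runtime monitor only needs to track the cross-process taint shown here, which is what reduces the overhead of a purely dynamic scheme.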
Convolutional neural networks (CNNs) have shown tremendous progress and performance in recent years. Since their emergence, CNNs have exhibited excellent performance in most classification and segmentation tasks. Currently, the CNN family includes various architectures that dominate major vision-based recognition tasks. However, building a neural network (NN) by simply stacking convolution blocks inevitably limits its optimization ability and introduces overfitting and vanishing gradient problems. One of the key reasons for these issues is network singularities, which cause degenerate manifolds in the loss landscape, leading to a slow learning process and lower performance. In this scenario, skip connections have turned out to be an essential unit of CNN design for mitigating network singularities. The idea of this research is to introduce skip connections into an NN architecture to augment information flow, mitigate singularities, and improve performance. This research experiments with different levels of skip connections and proposes a placement strategy for these links in any CNN. To test the proposed hypothesis, we design an experimental CNN architecture, named Shallow Wide ResNet (SRNet), as it uses a wide residual network as its base design. We have performed numerous experiments to assess the validity of the proposed idea. Two well-known datasets, CIFAR-10 and CIFAR-100, are used for training and testing the CNNs. The final empirical results show many promising outcomes in terms of performance, efficiency, and reduction of network singularity issues.
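The skip-connection mechanism itself is compact: the block output is f(x) + x, so information (and gradients) can bypass the transformation entirely. A minimal pure-Python sketch follows, with a toy scalar transform standing in for convolution layers; it illustrates the mechanism only, not SRNet's architecture:

```python
def relu(v):
    """Element-wise rectified linear unit on a plain list."""
    return [max(x, 0.0) for x in v]

def toy_transform(v, weight):
    """Stand-in for the block's convolution layers (a single scalar
    multiply here, purely for illustration)."""
    return [weight * x for x in v]

def residual_block(v, weight):
    """Output = relu(f(x) + x): the input is added back via the skip
    connection, so even a useless transform (weight = 0) leaves the
    signal intact instead of destroying it."""
    return relu([h + x for h, x in zip(toy_transform(v, weight), v)])
```

With weight = 0 the transform contributes nothing and the (non-negative) input passes through unchanged, which is exactly why skip connections keep degenerate units from blocking information flow.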