List of Posters
-
Non-rigid Shape Correspondence and Description Using Geodesic Field Estimate Distribution
Anirban Mukhopadhyay & Austin New
Non-rigid shape description and analysis is an unsolved problem in computer graphics.
Shape analysis is a fast evolving research field due to the wide availability of 3D shape databases.
Widely studied methods for this family of problems include the Gromov-Hausdorff distance, Bag-of-Features and diffusion geometry.
The limitations of the Euclidean distance measure in the context of isometric deformation have made geodesic distance a de facto
standard for describing a metric space for non-rigid shape analysis. In this work, we propose a novel geodesic field space-based
approach to describe and analyze non-rigid shapes from a point correspondence perspective.
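As a rough illustration of the geodesic machinery involved, geodesic distances on a triangulated surface are often approximated by shortest paths over the mesh's edge graph; the sketch below (with an invented toy mesh, not data from the poster) uses Dijkstra's algorithm for this purpose:

```python
import heapq

def geodesic_distances(edges, source):
    """Approximate geodesic distances from `source` over a mesh edge graph.

    `edges` maps each vertex to a list of (neighbor, edge_length) pairs.
    Runs Dijkstra's algorithm; on a dense triangulation the shortest
    edge-path length approximates the true surface geodesic.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in edges.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy mesh: a unit square split into two triangles (vertices 0-3),
# with the diagonal edge 0-3 of length ~sqrt(2).
mesh = {
    0: [(1, 1.0), (2, 1.0), (3, 1.414)],
    1: [(0, 1.0), (3, 1.0)],
    2: [(0, 1.0), (3, 1.0)],
    3: [(0, 1.414), (1, 1.0), (2, 1.0)],
}
d = geodesic_distances(mesh, 0)
```

Fast marching or exact polyhedral geodesic algorithms give tighter approximations; the edge-graph shortest path is simply the easiest baseline to state.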
-
Proactive Dissemination of Medical Information
Chinmay Atul Murugkar
The most widely used system for searching the medical literature is NCBI's PubMed. Through it a medical practitioner can find the information they need, but locating the right document requires significant effort and research. Moreover, given the rapid flow of new medical papers, it is hard to keep up with the latest findings or read through every paper to extract the required information. A few years ago, Plavix was commonly prescribed to heart-attack patients; only a year or two later were side effects such as increased blood pressure and acidity recognized, with dire consequences. It was then realized that a paper describing these side effects had been published long before but had gone unseen by most doctors. This research therefore makes an effort to proactively disseminate the right information on the web to the right medical practitioner, using the semantic search algorithm explained in this poster.
-
Multi-Context-Aware Anti-Vandalism Techniques for Wikipedia
Raga Sowmya Tummalapenta
Vandalism is a deliberate activity that compromises the integrity of Wikipedia. In general, around 8% of Wikipedia edits are vandalism. Elusive vandalism is a type of vandalism that does not exhibit the usual characteristics of vandalism and is hence hard to detect; examples include abusive words and changed dates. In my research, I am implementing three multi-context-aware techniques to detect vandalism in Wikipedia.
Approach 1: Context-aware approach: detects vandalism based on the context in which an edit's words are used, identifying words that are out of context with the existing words in an article.
Approach 2: Trustworthy ranking approach: detects vandalism based on the top-ranked documents retrieved for the edits using a trustworthy search engine.
Approach 3: Trustworthy ranking on Wikipedia: the same as Approach 2, but confined to documents from Wikipedia articles.
In all three approaches, features are extracted and models are trained on the data to test the accuracy of the methods.
-
Indexing and Querying Big Time-evolving Graphs
Phani Rohit Mullangi & Venkata Sri Ram Akella
Existing indexing systems can answer reachability queries on large static graphs efficiently, but when graphs evolve over time it becomes expensive to index every version, while on-demand traversal is slow. We are therefore working on a way to answer queries on time-evolving graphs (TEGs) much faster: we index intermediate versions and answer reachability queries based on the edits between them. We use an interval-based indexing system, and the queries we currently handle are reachability queries on large trees.
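The interval-based indexing mentioned above can be sketched for a single static tree: a DFS assigns each node an interval, and reachability reduces to interval containment. This is a minimal illustration of the indexing idea, not the authors' actual system (which additionally handles edits between versions):

```python
def interval_index(tree, root):
    """Label each node of a tree with a half-open [start, end) DFS interval.

    Node u can reach node v iff v's interval is nested inside u's,
    so a reachability query becomes two integer comparisons.
    """
    label, counter = {}, [0]

    def dfs(u):
        start = counter[0]
        counter[0] += 1
        for c in tree.get(u, []):
            dfs(c)
        label[u] = (start, counter[0])

    dfs(root)
    return label

def reaches(label, u, v):
    su, eu = label[u]
    sv, ev = label[v]
    return su <= sv and ev <= eu

# Toy tree: a -> {b, c}, b -> {d}.
tree = {"a": ["b", "c"], "b": ["d"]}
idx = interval_index(tree, "a")
```

For example, `reaches(idx, "b", "d")` holds while `reaches(idx, "c", "d")` does not, each decided in constant time after the one-pass DFS.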
-
Building an Infrastructure to Study the Lifecycles of Malicious Domains
Kevin Jonathan Warrick
DNS is an intermediary for a great breadth of computer security threats such as the propagation of malware, the command and control of botnets, and spam and phishing campaigns. Studying the lifecycles of malicious domain names will provide insight into the many classes of criminal networks that depend on DNS; however, existing frameworks [1] use only passive methods to reconstruct partial views of global DNS traffic. We have developed an infrastructure using the Hadoop distributed computing framework, called Digger, which, given a list of malicious domains and a list of open recursive servers, actively queries and aggregates DNS records. Digger will provide data for future research, per threat class, on the longevity of malicious domains, the number of IP addresses domains resolve to, and how often malicious domains change IPs, and may even allow correlating domains within the same malicious network.
[1] https://sie.isc.org/
-
Semantically Enriched Task Specification and Management in Crowdsourcing for Ontology Verification
Amna Basharat & Shima Dastgheib
Research in crowdsourcing is still in its infancy. Crowdsourcing is task-oriented, yet the efficiency and quality of task specification and management are not well addressed. The overall aim of this work is two-fold: 1) to harness the power of semantics to increase the efficiency of task management in crowdsourcing, and 2) to exploit the potential of the crowd to address the challenges of ontology verification. To meet this aim, we propose a generic task specification and management mechanism for crowdsourcing and contextualize it to the domain of ontology verification, a challenging and complicated task that requires human expert knowledge on the one hand and is time-consuming for the human on the other.
-
Learning Functional Microstates of Human Brain
Xiang Li
The concept of brain microstates has been used and validated for years in the Electroencephalography (EEG) field. However, there is a lack of a whole-brain definition of microstates based on functional Magnetic Resonance Imaging (fMRI) data. We explored the possibility of defining and characterizing functional microstates from resting-state fMRI data via Fisher dictionary learning and sparse coding algorithms. We found 16 microstates that are reproducible across different healthy populations and datasets, suggesting that these 16 microstates effectively represent the functional brain state space at rest. We applied the microstate models to study PTSD subjects, and found that they exhibit two additional, altered microstates that differentiate them from normal controls.
-
Argumentation on the World Wide Web
Mustafa Nural
Argumentation is the process of reasoning with or without support from evidence and facts. To date, most research has focused on representing the argumentation process using computer models and creating reasoning systems for those models. However, little has been done to capture argumentation automatically from unstructured text. This work focuses on capturing argumentation from unstructured text using frame semantics, a well-established natural language understanding paradigm that captures the semantics of a sentence based on its verb. We show how frame semantics can be used to capture arguments from text and how those captured arguments can be consumed in the domain of the World Wide Web.
-
Unicode and Domain-Specific Languages
Michael Cotterell
As recent programming languages provide improved conciseness and flexibility of syntax, the development of embedded or internal Domain-Specific Languages (DSLs) has increased. This project has created DSLs for Modeling & Simulation and Optimization. One of the goals is to make executable models (source code) concise, readable, and in a form familiar to experts in these domains; in some cases the code looks very similar to textbook formulas. This poster provides i) a brief introduction to Unicode in General Purpose Languages (GPLs), ii) some examples of Unicode-enriched functions using the Unicode DSL developed for this project, and iii) highlights of some of the features of Scala that make all this possible. The notation developed is clear and concise, which should lead to improved usability and extensibility.
-
Efficient SPARQL querying on the Cloud
Aniruddha Girish Jagalpure
SPARQL is an RDF query language. The frameworks currently used to execute SPARQL include Jena, Sesame and BigOWLIM, but these frameworks cannot be scaled to support large RDF graphs; they are designed to work on a single machine and hence cannot be used for RDF graphs that are terabytes in size and have billions of tuples. Experiments have shown that Jena, the most popular framework, can support only 10 million tuples in 2 GB of main memory. We therefore propose a new system that can handle large RDF graphs and return results to the user quickly by splitting the RDF graphs based on predicates and 'type' and storing the splits across nodes. The nodes then index the data in these files, store it in our data structure, and work together to answer the user's SPARQL query.
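A minimal sketch of the predicate-based splitting described above, with an invented hash-partitioning scheme standing in for the system's actual placement policy:

```python
from collections import defaultdict

def partition_by_predicate(triples, num_nodes):
    """Assign each (subject, predicate, object) triple to a node by
    hashing its predicate, so that all triples sharing a predicate land
    on the same node and a single-predicate SPARQL pattern touches only
    one partition.  Each node would then index its own partition."""
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[hash(p) % num_nodes].append((s, p, o))
    return parts

triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "type", "Person"),
]
parts = partition_by_predicate(triples, 4)
```

A pattern like `?x knows ?y` is then routed to the single node owning the `knows` partition, rather than broadcast to the whole cluster.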
-
Anatomy-guided Discovery of Consistent Connectivity-based Cortical Landmarks
Xi Jiang
Recently, several multimodal DTI/fMRI studies have demonstrated that consistent white matter fiber connection patterns can predict brain function. Although a variety of approaches have been proposed to discover large-scale cortical landmarks with common structural connection profiles, rich anatomical information such as gyral/sulcal folding patterns and structural connection pattern homogeneity has not yet been incorporated into existing DTI/fMRI studies. This paper presents a novel anatomy-guided discovery framework that defines and optimizes a dense map of cortical landmarks possessing group-wise consistent anatomical and connectional profiles. The framework integrates reliable and rich anatomical information for landmark initialization, optimization and prediction, which are formulated and solved as an energy minimization problem. Validation results based on fMRI data demonstrate that these landmarks are reproducible and predictable and exhibit accurate structural and functional correspondences across individuals and populations, offering a universal and individualized brain reference system for neuroimaging research.
-
Ontology-Assisted Question Answering System using CRF
Amir Asiaee
"Ontology-Assisted Question Answering System using CRF" aims to utilize state-of-the-art semantic web and natural language processing approaches for effective automated question answering for life science users. Part of this project is a web-based tool called Cuebee (Knowledge-Driven Query Formulation), which targets non-computer-expert users for SPARQL generation. It uses ontology schemas to guide users step-by-step in formulating queries in an intuitive way. Cuebee takes advantage of an OWL-DL reasoner called Pellet, which is equipped with SPARQL-DL (an extension of SPARQL suitable for OWL-DL ontologies). A significant enhancement to Cuebee is adding the ability to formulate questions posed in natural language. To process these questions, we use Conditional Random Fields (CRFs), a probabilistic model designed to label and segment sequences of observations. We extract important entities from questions, find the corresponding classes and relationships in related ontologies, and feed them to Cuebee to build the final SPARQL query.
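As a rough illustration of how a trained sequence model labels a question, the sketch below hand-sets emission and transition scores and runs Viterbi decoding; in the actual system a CRF would learn such scores from annotated questions, and the words, labels and scores here are all invented:

```python
def viterbi(obs, states, emit, trans):
    """Most likely label sequence under a linear-chain model with given
    emission and transition scores (higher is better).  This is the
    decoding step shared with CRFs; unseen words get a default penalty."""
    V = [{s: emit[s].get(obs[0], -1.0) for s in states}]
    back = []
    for w in obs[1:]:
        prev = V[-1]
        cur, bp = {}, {}
        for s in states:
            best_p = max(states, key=lambda p: prev[p] + trans[p][s])
            cur[s] = prev[best_p] + trans[best_p][s] + emit[s].get(w, -1.0)
            bp[s] = best_p
        V.append(cur)
        back.append(bp)
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for bp in reversed(back):
        path.append(bp[path[-1]])
    return list(reversed(path))

# Invented toy model: tag entity mentions in a question.
states = ["ENTITY", "O"]
emit = {"ENTITY": {"glucose": 2.0}, "O": {"what": 1.0, "is": 1.0}}
trans = {"ENTITY": {"ENTITY": 0.5, "O": 0.0},
         "O": {"ENTITY": 0.0, "O": 0.5}}
labels = viterbi(["what", "is", "glucose"], states, emit, trans)
```

The extracted `ENTITY` spans would then be mapped to ontology classes before Cuebee assembles the SPARQL query.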
-
Using Reinforcement Learning to Model True Team Behavior in Uncertain Multiagent Settings in Interactive DIDs
Muthukumaran Chandrasekaran
In this work, we investigate modeling true team behavior of agents using reinforcement learning (RL) in the context of the Interactive Dynamic Influence Diagram (I-DID) framework for solving multiagent decision making problems in uncertain settings. This presents us with a non-trivial problem: the learning agents must be set up in such a way that they can learn the existence of external factors, such as other agents, thereby modeling teamwork in a non-traditional but systematic way. So far, we have only experimented with I-DIDs in 2-agent settings, so we also seek to show the framework's flexibility by designing and testing I-DIDs for scenarios with 3 or more agents.
-
Indexing Micro-Blogs using Distributed Architecture for Real-Time Search
Akshay Vivek Choche
In order to perform efficient and effective real-time search, newly created content must be available for search as soon as it is created. The simplest approach is to index the data as soon as it is created, but this cannot be adopted in real-time microblogging systems like Twitter, where millions of concurrent users simultaneously update their micro-blogs; Twitter recently reported over 340 million tweets per day [1]. Under such a high update load, it is not possible to index each tweet and simultaneously allow users to perform real-time keyword-based search, as the two goals are contradictory and lead to high lock contention. The system presented in this poster effectively indexes real-time tweets by classifying them as Distinguished Tweets or Noisy Tweets based on whether they would appear in the top-K results of a recent query set Q. Distinguished Tweets are indexed in real time, whereas Noisy Tweets are appended to a log file and periodically indexed offline. This system is an extension of the tweet index [2] presented by Chun Chen et al. We plan to deploy the system in a distributed environment to amplify its scalability, and we add a caching mechanism absent from the previous system: by caching results for popular queries, we believe the response time can be decreased drastically and overall system performance improved. We also plan to extend the ranking mechanism to incorporate both the similarity between a query and a tweet and the time the tweet was submitted; currently Twitter ranks its search results by timestamp alone, which is inefficient.
[1] http://thenextweb.com/socialmedia/2012/03/21/twitter-has-over-140-million-active-users-sending-over-340-million-tweets-a-day/
[2] Chun Chen et al. TI: An Efficient Indexing Mechanism for Real-Time Search on Tweets
[3] A. Sun et al. Searching blogs and news: a study on popular queries
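The Distinguished/Noisy split described in the abstract can be sketched as follows; the term-overlap relevance score and the data structures are illustrative stand-ins for the system's actual ranking function:

```python
def classify_tweet(tweet_terms, recent_queries, kth_scores, k=10):
    """Decide whether a tweet is 'distinguished' (index immediately) or
    'noisy' (append to a log for periodic offline indexing).

    `recent_queries` maps each recent query to its term set; `kth_scores`
    maps a query to the current k-th best relevance score in its top-K
    list.  Relevance here is a placeholder term-overlap count; a real
    system would combine query similarity with tweet recency.
    """
    for q, q_terms in recent_queries.items():
        score = len(tweet_terms & q_terms)
        if score > kth_scores.get(q, 0):
            return "distinguished"  # would enter some query's top-K
    return "noisy"

# Invented example state for one recent query.
recent = {"world cup": {"world", "cup"}}
kth_scores = {"world cup": 1}
```

A tweet overlapping a recent query strongly enough to displace its k-th result is indexed in real time; everything else goes to the offline log.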
-
Generalized and Bounded Policy Iteration for Finitely Nested Interactive POMDPs: Scaling Up
Ekhlas Sonu
Policy iteration algorithms for partially observable Markov decision processes (POMDPs) offer the benefits of quick convergence and the ability to operate directly on the solution, which usually takes the form of a finite state controller. However, the controller tends to grow quickly in size across iterations, which makes its evaluation and improvement costly. Bounded policy iteration provides a way of keeping the controller size fixed while improving it monotonically until convergence. In this paper, we show how we may perform policy iteration in settings formalized by the interactive POMDP framework. Although policy iteration has been extended to decentralized POMDPs, the context there is strictly cooperative; its generalization here makes it useful in non-cooperative settings as well. As interactive POMDPs involve modeling others, we ascribe nested controllers to predict others' actions, with the benefit that the controllers compactly represent the model space. We evaluate our approach on multiple problem domains, and demonstrate its properties and scalability.
-
Qrator: A Curation Tool for Glycan Structures
Matthew Eavenson
Curating glycans is normally a tedious, labor-intensive process. With the use of tree matching algorithms guided by a Glycotree, this process can become considerably more efficient and less error-prone. A Glycotree is a canonical representation of all possible glycan structures of a given type (e.g., N-glycan or O-glycan). Qrator is an application that harnesses Glycotrees to help ensure that glycan structures are biosynthetically valid, with the ultimate purpose of adding them to the GlycO ontology. This level of curation will ensure high-quality structures in GlycO and will alleviate much of the tedious analysis previously required of glycobiologists. Currently, there is only one Glycotree (for N-glycans) in the GlycO ontology, but we are in the initial stages of constructing other Glycotrees, such as the O-glycan tree, from scratch.
-
Closed Loop Prediction Based on Contact Map
Liang Ding & Joseph Robertson
According to the theory of Trifonov et al. [1], closed loops, which they dubbed loop-n-lock structures, are returns of the polypeptide chain trajectory to close contact with itself. They are proposed as fundamental and ancient units of protein structure, and evidence supports the contention that globular proteins are composed of linear combinations of such closed loops; thus they have great potential as units of a hierarchical folding model. In this study, we aimed to predict closed loops accurately using only sequence information, specifically a contact map generated by NNcon [2]. We also investigated the relationship of closed loops with aligned segments from structural-neighborhood protein chain pairs from the Dali data set [5]. The results show that protein pairs with significant structural similarities are more likely to have conserved closed loops. To support a hierarchical folding theory, we explored the secondary structures within closed loops. Our investigation shows that 80% of closed loops entirely contain some secondary structure, lending support to the theory that the closed loop is a higher-level intermediate structure of proteins.
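As a rough sketch of how closed-loop candidates can be read off a contact map, the code below scans for contacts whose sequence separation falls within a loop-length window; the window bounds are illustrative defaults, not the study's actual parameters:

```python
def find_closed_loops(contact_map, min_len=10, max_len=50):
    """Scan a residue-residue contact map for closed-loop candidates:
    pairs (i, j) in contact whose sequence separation j - i falls in a
    plausible closed-loop length range, i.e. positions where the chain
    returns to close contact with itself.  `contact_map` is a symmetric
    0/1 matrix indexed by residue position."""
    n = len(contact_map)
    loops = []
    for i in range(n):
        for j in range(i + min_len, min(n, i + max_len + 1)):
            if contact_map[i][j]:
                loops.append((i, j))
    return loops

# Toy 30-residue map with one long-range contact (a loop candidate)
# and one short-range contact (too local to close a loop).
cmap = [[0] * 30 for _ in range(30)]
cmap[2][20] = cmap[20][2] = 1
cmap[0][3] = cmap[3][0] = 1
loops = find_closed_loops(cmap)
```

On a predicted map (e.g. from NNcon) one would additionally threshold contact probabilities and merge overlapping candidates, which this sketch omits.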
-
Adaptive Federation of SPARQL queries over Linked Open Data
Amir R. Abdolrashidi, Shima Dastgheib & Amna Basharat
Linked Open Data (LOD) is the most promising realization of the Semantic Web vision so far. SPARQL is the W3C-recommended language for querying the RDF data sets distributed and linked over the LOD. However, the scalability and adaptivity of engines that execute such queries are not comparable to their SQL counterparts. In this work, we first conduct a comprehensive survey and analysis of existing approaches to querying the LOD. We then identify five major cases for adaptivity and propose a scalable architecture that meets these requirements. Our approach is inspired by the adaptive query processing mechanisms that have been extensively utilized for query optimization in relational databases. Nevertheless, tailoring those mechanisms to SPARQL is more challenging, as we deal with the ever-growing scale of the Linked Open Data.
-
Modeling Deep Strategic Reasoning by Humans in Competitive Games
Xia Qu
The prior literature on strategic reasoning by humans, of the sort "what do you think that I think that you think", holds that humans generally do not reason beyond a single level. However, recent evidence suggests that when games are made competitive, and therefore representationally simpler, humans exhibit behavior more consistent with deeper levels of recursive reasoning. We seek to computationally model behavioral data consistent with deep recursive reasoning in competitive games, using generative process models built from agent frameworks that simulate the observed data well and also exhibit psychological intuition.
-
Improved Convergence of Iterative Ontology Alignment using Block-Coordinate Descent
Uthayasanker Thayasivam
A wealth of ontologies, many of which overlap in their scope, has made aligning ontologies an important problem. Consequently, several algorithms now exist for automatically aligning ontologies, with mixed success in their performances. Crucial challenges for these algorithms involve scaling to large ontologies and, as applications of ontology alignment evolve, performing the alignment in real time without compromising on its quality. A class of alignment algorithms is iterative and often consumes more time than others while delivering solutions of high quality. We present a novel and general approach for speeding up the optimization process utilized by these algorithms. Specifically, we use the technique of block-coordinate descent to potentially improve the speed of convergence of the iterative alignment techniques. We integrate this approach into three well-known alignment systems and show that the enhanced systems generate similar or improved alignments in significantly less time on a comprehensive testbed of ontology pairs.
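Block-coordinate descent itself is simple to state: cycle through blocks of variables, minimizing the objective over one block while holding the rest fixed. The sketch below applies it to a toy quadratic, not to an alignment objective:

```python
def block_coordinate_descent(argmin_steps, x, iters=100):
    """Generic block-coordinate descent: cycle through the blocks,
    exactly minimizing the objective over one block while the others are
    held fixed.  `argmin_steps[name]` returns the minimizer of that
    block given the current iterate `x` (a dict of block values)."""
    for _ in range(iters):
        for name, step in argmin_steps.items():
            x[name] = step(x)
    return x

# Toy objective f(x, y) = (x-1)^2 + (y-2)^2 + 0.5*(x-y)^2.
# Setting each partial derivative to zero gives the per-block argmins.
steps = {
    "x": lambda v: (2 + v["y"]) / 3,   # solves 2(x-1) + (x-y) = 0
    "y": lambda v: (4 + v["x"]) / 3,   # solves 2(y-2) - (x-y) = 0
}
sol = block_coordinate_descent(steps, {"x": 0.0, "y": 0.0})
```

Each cycle contracts toward the joint minimizer (x, y) = (1.25, 1.75); for alignment, a "block" would instead be a subset of the correspondence variables.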
-
Inferring Group-wise Consistent Multimodal Brain Networks via Multi-view Spectral Clustering
Hanbo Chen
Quantitative modeling and analysis of structural and functional brain networks based on diffusion tensor imaging (DTI)/functional MRI (fMRI) data has received extensive interest recently. However, the regularity and variability of these structural or functional brain networks across multiple neuroimaging modalities and across individuals are largely unknown. This paper presents a novel approach to infer group-wise consistent brain sub-networks from multimodal DTI/fMRI datasets via multi-view spectral clustering of cortical networks, which were constructed on our recently developed and extensively validated large-scale cortical landmarks. We applied the proposed algorithm on 80 multimodal structural and functional brain networks of 40 healthy subjects, and obtained consistent multimodal brain sub-networks within the group. Our experiments demonstrated that the derived brain sub-networks have improved inter-modality and inter-subject consistency.
-
Descriptive Reinforcement Learning in Sequential Games
Roi Ceren
Humans often appear to make irrational decisions when operating in uncertain environments. Much work in behavioral game theory has been done to illuminate the operating conditions of human decision making. Though this work is exclusively in the single-shot normative game setting, it provides keen insight into cognitive biases, such as forgetfulness and misattribution of rewards, which influence the decisions that people make. Taking the concepts from behavioral game theory, my research attempts to apply parameters representing these cognitive biases to sequential games. In collaboration with the Psychology Department, we have run several experiments in which participants observe an Uninhabited Aerial Vehicle (UAV) navigate from a start sector to a goal sector and are asked, at each sector, what the likelihood of overall success is. Our goal is to design a reinforcement learning algorithm, complete with behavioral parameters, that mimics the predictions made by the participants.
-
Modeling Restrictions in Ontology for Improved Ontology Alignment
Tejas D. Chaudhari
An ontology represents knowledge as a set of concepts within a domain and the relationships among those concepts. With the expanding field of the semantic web, ontologies are becoming a ubiquitous approach to representing knowledge. While expanding a knowledge domain, one may need to find a mapping between two or more ontologies to identify the overlapping knowledge; this is where ontology alignment is used. Due to the rapidly increasing number of ontologies, there is a need to make the alignment process automated and accurate. When aligning two ontologies, the lexical and structural similarity of their concepts and relationships is utilized. Along with concepts, an ontology also includes restrictions and Boolean expressions, which are represented as anonymous classes. Being anonymous, these classes cannot be used in lexical matching, yet they cannot be completely ignored, as they carry significant semantic information about the surrounding concepts. In our survey of the literature, we found no tool that deals with aligning restrictions or Boolean expressions. To align restrictions and Boolean expressions, one needs to align the underlying anonymous classes, and this poster presents an approach to do exactly that.
-
Clickbot Detection via Behavioral Analysis
Chris Neasbitt
Automated click fraud has become an ever more prevalent problem as pay-per-click advertising has become the primary revenue source for internet services and applications that are free to end users. Current techniques for detecting clickbots, malware designed to perpetrate click fraud, focus mainly on reverse engineering and manual analysis of malware binaries. We propose an alternative method: detecting clickbots via analysis of the HTTP traffic the malware generates. By building graphs that represent the relationships between HTTP requests and responses, we can determine how each individual malware operates in relation to well-known advertising networks, and we use this information to identify traffic patterns that can be associated with click fraud. We then replay suspicious traffic traces through our instrumented browser to determine whether a request was generated manually or by automated means. By combining these two approaches, we can distinguish clickbots from other malware without code inspection.
-
SeaS: Sensors as a Service
Victor Lawson
Wireless sensor networks (WSNs) have become the standard data collection tool for researchers gathering environmental information, so much so that a "Sensor Web" has been theorized. This consumer utilization of sensor technology, combined with current ubiquitous computing demands, will open future markets for sensor-data cloud services. Sensor network information systems already exist for environmental and healthcare monitoring, smart homes and security, military surveillance and many other domains, but a costly duplication of effort is being carried out by the creators of these systems. A consolidated cloud service could unite these efforts and reduce the costs associated with redundant systems. Such a distributed cloud service will require layers for query languages and planning, resource management, pricing models, rich user interfaces, network security and access control, and adaptability to changing technology. Consumer demand for this brand of ubiquitous data will increase as more sensor networks are deployed and cloud service models are developed.
-
PeerRush: Mining for Unwanted P2P Traffic
Babak Rahbarinia
A botnet is a network of compromised hosts (bots) controlled by an attacker. Botnets, particularly peer-to-peer (P2P) botnets, are among the most prevalent threats to cyber-security, and the ineffectiveness of available detection methods has made them very popular with botmasters. Their stealthiness is mainly due to two factors: 1) decentralization (a botnet comprises many bot-infected machines) and 2) encryption (encrypted packet payloads). In this work we propose a novel botnet detection system that detects unwanted botnet traffic in a network and can cope with stealthy botnets that encrypt their packet payloads. To achieve this, we first identify the hosts that are running P2P applications, and then match their P2P management flows (not data flows) against our trained one-class and multi-class classifiers to distinguish benign P2P applications from malicious ones. We trained and tested our classifiers on real-world data; the results show that the system effectively detects the type of P2P traffic with a low false-positive rate.