Bad cellular reception is a universal frustration, but keeping the public at four bars takes a lot of work. Cellular providers constantly test, tune, and develop their infrastructure, with a close eye on their performance – and that of their competitors. Optimizing the network requires keeping tabs on signal strength and identifying problems quickly so that enhancements can be made, but high-quality data isn’t available for every time and location. To fill in the gaps, operators need to be able to predict when and where signal strength will be strong, and when it will fall short of customer expectations. In work being presented at the 2019 World Wide Web Conference, NCASD Lab PI Carter Butts, with collaborators Emmanouil Alimpertis, Athina Markopoulou, and Konstantinos Psounis, shows how machine learning methods can be used to assist with this problem. Using flexible modeling techniques that can easily adapt to the peculiarities of local geography, the team’s approach is able to stitch together even irregularly and unevenly sampled measurements of signal strength from users’ phones into a “map” that allows signal strength to be accurately predicted at any location in the area, at any time of day. This approach improves on prior efforts that used less flexible techniques, that did not consider both location and time, or that were limited to particular types of measurement. The team’s work demonstrates how the latest developments in data analysis are helping to maintain and improve the lifelines on which we all depend.
When celebrities or other well-known figures go public with the bad news that they have cancer, there’s an opportunity for more than sympathy. As shown in a new paper in the journal Cancer Control by NCASD Lab PI Carter Butts and alumnus Ben Gibson, with collaborators Sarah Vos and Jeannette Sutton, these announcements can serve as “focusing events” that direct public attention to the disease, and can be used by public health agencies to promote prevention, diagnosis, and treatment messages. Using over 1.2 million cancer-related Twitter messages collected over a nine-day period, the team showed that a prominent actor’s disclosed cancer diagnosis and treatment generated a large spike in related discussion regarding the disease, with particular enhancement in messaging around diagnostics. An examination of message content also highlighted ways in which public perception of and communication about cancer diverge from what is often assumed by public health experts, indicating a gap between cancer communicators and the broader audience they frequently attempt to reach. The team’s research shows that bad news about cancer can create important windows of opportunity to facilitate discussion – but that leveraging those windows requires a communication strategy that takes into account how members of the public grapple with the disease.
NCASD Lab PI Carter Butts recently traveled to Washington, DC, to receive recognition as a newly elected Fellow of the American Association for the Advancement of Science. Each year, the AAAS recognizes a small number of scientists across all fields for outstanding contributions to the promotion and/or progress of scientific knowledge. Butts was recognized for his contributions to the mathematical, statistical, and computational modeling of relational structure and dynamics, in both human and non-human systems. More information can be found in the announcement on the AAAS web site.
Complex social, biological, or other networks often arise from a wide range of mechanisms, acting within a heterogeneous and often dynamic environment. This complicated “stew” of factors gives rise to networks that are anything but clean and elegant: rather, they are decidedly “lumpy,” consisting of myriad overlapping subgroups of varying size and consistency. While this heterogeneity can make analysis difficult, it can also provide clues to the drivers of network formation, since different mechanisms – and different features of the local environment – tend to produce groups of characteristic size and composition. To unpack these generating mechanisms, one needs a way to “dissect” subgroups within larger networks and discover how they relate to their members’ attributes. In a new paper, NCASD lab alumnus Sean Fitzhugh and lab PI Carter Butts offer an approach to this problem. The paper, which appears in the journal Social Networks, exploits an easily implemented, non-parametric technique to identify the ranges of subgroup sizes over which individuals with particular attributes are especially likely (or unlikely) to be found together. Applying the method to Facebook friendship networks from a number of universities, Fitzhugh and Butts show that the method is able to detect idiosyncratic but important features of the social landscape – such as the undergraduate housing systems used at schools like CalTech and Wellesley – that indirectly shape student friendship networks. By looking for shared attributes that foster ties but that are not the building blocks of larger groups, Fitzhugh and Butts show how the precursors of bridging ties can be identified (in this case, attending the same high school). Armed with this new approach, network analysts can efficiently dissect networks ranging from friendship and advice to interactions among biomolecules, revealing clues to the hidden processes that give rise to them.
Measuring social, biological, and other networks is often a difficult and expensive process, involving surveys, experiments, or other time-consuming and costly procedures. Given that no measurement is perfect, how do we get the greatest “bang for the buck” when trying to assess network ties? A new paper by NCASD lab members Francis Lee and Carter Butts, published in the journal Social Networks, addresses this question. Lee and Butts consider the common situation in which measurements on a potential edge are obtained from both parties to that edge, and these potentially discrepant reports are to be integrated. When the reports agree, the problem is easy: go with the consensus. But what happens when the reports disagree? Is it better to require both parties to agree that the edge is present to count it (“mutual assent”), or is it better to count an edge if either party claims it (“unilateral nomination”)? Applying a hierarchical Bayesian model to extensive data on networks from a variety of settings, Lee and Butts are able to assess the performance of these simple heuristics, and render a verdict: so long as the true network is fairly sparse, requiring mutual assent gives better results than unilateral nomination. As they show, the reason for this is surprisingly simple. In a sparse network (one with relatively few edges per individual), there are many more opportunities to invent spurious ties that are not actually present than to miss ties that are actually there – so, even if an informant is less likely to invent ties than to omit them, using the method that guards against spurious ties turns out to yield better results. These results provide direct and easily followed guidance for researchers working with social network data, and are also applicable to settings such as protein-protein interaction networks in which similar types of error also arise. By improving the quality of our measurements, we can ensure that researchers get the most out of their hard-won data.
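To make the sparsity argument concrete, here is a minimal back-of-the-envelope sketch in Python. It is not the hierarchical Bayesian model used in the paper: it simply assumes that the two parties report independently, each with a hypothetical false-negative rate (omitting a real tie) and false-positive rate (inventing a tie), and computes the expected share of misclassified pairs under each rule. The function name and all numerical values are illustrative assumptions.

```python
def expected_error_rates(density, false_neg, false_pos):
    """Expected per-pair misclassification rate under each combining rule.

    Assumes (for illustration only) that the two parties report
    independently, with the same error rates.
    """
    # Mutual assent: an edge is counted only if BOTH parties report it.
    miss_mutual = 1 - (1 - false_neg) ** 2      # at least one party omits a real tie
    spurious_mutual = false_pos ** 2            # both parties invent the same tie

    # Unilateral nomination: an edge is counted if EITHER party reports it.
    miss_unilateral = false_neg ** 2                # both parties omit a real tie
    spurious_unilateral = 1 - (1 - false_pos) ** 2  # at least one party invents it

    # Weight each kind of error by how many true ties and non-ties exist.
    mutual = density * miss_mutual + (1 - density) * spurious_mutual
    unilateral = density * miss_unilateral + (1 - density) * spurious_unilateral
    return mutual, unilateral


# Hypothetical rates: informants omit ties (30%) more often than they invent them (5%).
for density in (0.01, 0.5):
    mutual, unilateral = expected_error_rates(density, false_neg=0.3, false_pos=0.05)
    print(f"density={density}: mutual assent={mutual:.4f}, unilateral={unilateral:.4f}")
```

Under these assumed rates, mutual assent makes fewer errors when only 1% of possible ties exist, while unilateral nomination does better when half of them do. This matches the intuition that in a sparse network the many absent ties dominate the error count.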
Heliconius butterflies are unusual in consuming pollen (unlike butterflies that live on nectar), a task made more difficult by the fact that they lack the mouthparts needed to chew it. How do these insects manage to devour pollen grains too big to swallow, without the ability to chew them? The answer lies in their saliva: Heliconius butterflies have evolved special enzymes to break down pollen into edible components. In a recent collaboration between the Briscoe, Martin, and NCASD labs, the team has identified the enzymes (called cocoonases) used by the butterflies to pre-digest pollen and modeled their structures. The team found numerous different cocoonases used by Heliconius butterflies, but there is a catch: while conventional evolutionary wisdom would suggest that different species would have different enzymes, in fact each species has the whole complement. The mystery deepened when the researchers discovered that the different cocoonases had identical active sites, implying that they did not substantially differ in their preferred chemical targets. Why maintain a whole spectrum of enzymes that all do the same thing? The solution to the mystery lies in the outside of the enzyme. The team found that while their active sites were the same, the Heliconius cocoonases varied systematically in terms of their surface properties. These differences appear to have evolved to cope with one of the challenges of pollen-eating: it’s heterogeneous stuff. To get digestive enzymes into every nook and cranny of a pollen grain, a butterfly needs an arsenal of biomolecules, each ideal for diffusing into a different type of chemical environment. Armed with these “chemical teeth,” Heliconius butterflies are able to tap into a rich foodstuff that would otherwise be too tough to swallow.
Key to unlocking the mysteries of the cocoonases was a novel protocol for modeling biological molecules developed as a collaboration between the NCASD and Martin labs. In addition to solving basic problems in evolutionary biology, this work holds the potential to lead to new classes of enzymes with the ability to work in a wider range of medical or industrial settings. The research appears in the Proceedings of the Royal Society B, and can be found at http://rspb.royalsocietypublishing.org/content/285/1870/20172037.
Simulating the structure of complex networks is an important challenge for problems ranging from modeling organizational structure to understanding the behavior of protein aggregates. While recent years have seen many innovations in this area, obtaining provably high-quality simulations for networks with complex dependence has remained an elusive goal. In a forthcoming paper in the Journal of Mathematical Sociology, NCASD Lab PI Carter Butts shows how this can be done. Introducing a family of simulation algorithms inspired by a technique called “coupling from the past,” Butts demonstrates the feasibility of obtaining exact draws from even highly complex sampling distributions, avoiding the sometimes problematic approximations inherent in current methods. Because these algorithms can be used for a very broad class of network models, they can be applied in a wide range of settings both within and beyond the social sciences. This work illustrates the potential for computational methods to extend the reach of scientific practice, a major theme of research in the NCASD Lab.
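For readers unfamiliar with coupling from the past, the idea can be illustrated on a toy problem. The sketch below is not the network algorithm from the paper: it applies the standard monotone version of the technique to a simple reflecting random walk on the integers 0 to n, whose long-run distribution is uniform. The function names, the choice of chain, and the parameter values are all assumptions made for illustration.

```python
import random


def step(x, u, n):
    """One move of a reflecting random walk on 0..n, driven by the random number u."""
    # The same u moves every starting state in the same direction, so the
    # ordering of states is preserved (the update is monotone).
    return min(x + 1, n) if u < 0.5 else max(x - 1, 0)


def cftp_draw(n, rng):
    """Return one exact draw from the walk's long-run distribution."""
    us = []  # us[k] drives the step from time -(k+1) to time -k
    horizon = 1
    while True:
        # Extend the random numbers further into the past, keeping the
        # ones already generated for times closer to the present.
        while len(us) < horizon:
            us.append(rng.random())
        low, high = 0, n  # lowest and highest possible starting states
        for k in range(horizon - 1, -1, -1):  # earliest step first
            low = step(low, us[k], n)
            high = step(high, us[k], n)
        if low == high:
            # Every starting state has been squeezed to the same value at time 0.
            return low
        horizon *= 2  # not yet coalesced: start from further in the past


rng = random.Random(2019)  # arbitrary seed for reproducibility
samples = [cftp_draw(4, rng) for _ in range(1000)]
print({state: samples.count(state) for state in range(5)})
```

Because the same random numbers are reused each time the starting point is pushed further into the past, the value returned once the highest and lowest chains meet is an exact draw from the long-run distribution rather than an approximation.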
NCASD Lab alumna Xuhong Zhang has hit the ground running as a postdoctoral fellow in the lab of Professor Debashis Ghosh, chair of the Department of Biostatistics and Informatics at the Colorado School of Public Health on the University of Colorado medical campus. After graduating this past summer, Dr. Zhang relocated to the Denver, Colorado area to begin her new position. She brings along her expertise in the analysis of complex data sets, development of novel techniques for analysis of dynamic data, and bioinformatics. At the NCASD lab, Dr. Zhang’s work included pathbreaking research on the adaptation of spectroscopic ideas to data on human interaction and the study of urban ecology, as well as work on novel approaches to the prediction of enzyme structure and function. We wish Dr. Zhang all the best as she embarks on this next chapter of her career!
NCASD Postdoc Gianmarc Grazioli, Lab PI Carter Butts, and Prof. Ioan Andricioaei from the UCI Chemistry Department have published new results showing how the performance of molecular dynamics simulations can be improved with a little help from machine learning. These results are contained in their forthcoming paper, “Automated Placement of Interfaces in Conformational Kinetics Calculations Using Machine Learning,” to appear in the Journal of Chemical Physics. Their new technique employs a machine learning approach known as a Support Vector Machine to automatically define high-dimensional reaction coordinates for calculating chemical kinetics. This approach dramatically reduces the cost of studying the complex configurational changes of large biomolecules, such as proteins and DNA, as well as the cost of simulating high-dimensional systems such as those associated with complex chemical reactions. Understanding the complex motions of biomolecules and the kinetics of chemical reactions is essential not only for a deeper fundamental understanding of the molecular machinery that makes life possible, but also for such applications as the computational design of drug molecules and novel materials.
Lab PI Carter Butts has been elected to two offices in the American Sociological Association: the Section Council of the ASA Section on Methodology, and Chair-Elect of the ASA Section on Mathematical Sociology. Butts is currently serving a term on the Section Council of the ASA Section on Mathematical Sociology, and will transition from this office to the office of Chair in the coming year.
Sections of the ASA serve the sociological community by supporting research in specific fields of the discipline. The Section on Methodology supports work on novel techniques and practical advances in methods for the measurement and analysis of social phenomena, while the Section on Mathematical Sociology supports the development and use of mathematical, computational, and other formal approaches to the study of social systems. Research by NCASD lab members is frequently featured in conference activities by both sections, and we are pleased to have this opportunity to make further contributions to these important communities.