Complex social, biological, or other networks often arise from a wide range of mechanisms, acting within a heterogeneous and often dynamic environment. This complicated “stew” of factors gives rise to networks that are anything but clean and elegant: rather, they are decidedly “lumpy,” consisting of myriad overlapping subgroups of varying size and consistency. While this heterogeneity can make analysis difficult, it can also provide clues to the drivers of network formation, since different mechanisms – and different features of the local environment – tend to produce groups of characteristic size and composition. To unpack these generating mechanisms, one needs a way to “dissect” subgroups within larger networks and discover how they relate to their members’ attributes. In a new paper, NCASD lab alumnus Sean Fitzhugh and lab PI Carter Butts offer an approach to this problem. The paper, which appears in the journal Social Networks, exploits an easily implemented, non-parametric technique to identify the ranges of subgroup sizes over which individuals with particular attributes are especially likely (or unlikely) to be found together. Applying the method to Facebook friendship networks from a number of universities, Fitzhugh and Butts show that it can detect idiosyncratic but important features of the social landscape – like the undergraduate housing systems used at schools like Caltech and Wellesley – that indirectly shape student friendship networks. By looking for shared attributes that foster ties but that are not the building blocks of larger groups, Fitzhugh and Butts show how the precursors of bridging ties can be identified (in this case, attending the same high school). Armed with this new approach, network analysts can efficiently dissect networks ranging from friendship and advice to interactions among biomolecules, revealing clues to the hidden processes that give rise to them.
Measuring social, biological, and other networks is often a difficult and expensive process, involving surveys, experiments, or other time-consuming and costly procedures. Given that no measurement is perfect, how do we get the greatest “bang for the buck” when trying to assess network ties? A new paper by NCASD lab members Francis Lee and Carter Butts, published in the journal Social Networks, addresses this question. Lee and Butts consider the common situation in which measurements on a potential edge are obtained from both parties to that edge, and these potentially discrepant reports are to be integrated. When the reports agree, the problem is easy: go with the consensus. But what happens when the reports disagree? Is it better to require both parties to agree that the edge is present to count it (“mutual assent”), or is it better to count an edge if either party claims it (“unilateral nomination”)? Applying a hierarchical Bayesian model to extensive data on networks from a variety of settings, Lee and Butts are able to assess the performance of these simple heuristics, and render a verdict: so long as the true network is fairly sparse, requiring mutual assent gives better results than unilateral nomination. As they show, the reason for this is surprisingly simple. In a sparse network (one with relatively few edges per individual), there are many more opportunities to invent spurious ties that are not actually present than to miss ties that are actually there – so, even if an informant is less likely to invent ties than to omit them, using the method that guards against spurious ties turns out to yield better results. These results provide direct and easily followed guidance for researchers working with social network data, and are also applicable to settings such as protein-protein interaction networks in which similar types of error also arise. By improving the quality of our measurements, we can ensure that researchers get the most out of their hard-won data.
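The sparse-network argument can be checked with a little arithmetic. The sketch below is not the hierarchical Bayesian model used in the paper; it is a back-of-the-envelope calculation that assumes the two reports on each potential edge err independently, with one false positive rate and one false negative rate shared by all informants. The function name, the network size, and the error rates are hypothetical values chosen for illustration.

```python
def expected_errors(n, density, false_pos, false_neg):
    """Expected number of misclassified dyads under each aggregation rule.

    Assumes (for illustration only) an undirected network on n nodes in
    which each of the two reports on a dyad errs independently:
    false_pos = P(report a tie | no tie), false_neg = P(omit a tie | tie).
    """
    dyads = n * (n - 1) / 2
    edges = density * dyads
    non_edges = dyads - edges

    # Mutual assent: a tie is counted only if both parties report it.
    # A true tie is missed if either party omits it; a spurious tie
    # appears only if both parties invent it.
    mutual = edges * (1 - (1 - false_neg) ** 2) + non_edges * false_pos ** 2

    # Unilateral nomination: a tie is counted if either party reports it.
    # A true tie is missed only if both omit it; a spurious tie appears
    # if either party invents it.
    unilateral = edges * false_neg ** 2 + non_edges * (1 - (1 - false_pos) ** 2)

    return {"mutual_assent": mutual, "unilateral_nomination": unilateral}


if __name__ == "__main__":
    # Hypothetical rates: informants omit ties far more often than they invent them.
    for density in (0.02, 0.50):
        print(density, expected_errors(100, density, false_pos=0.05, false_neg=0.30))
```

Under these assumed rates, the sparse case favors mutual assent even though omissions are six times as likely as inventions per report, simply because non-edges vastly outnumber edges; at a density of 0.5 the ordering reverses.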
Heliconius butterflies are unusual in consuming pollen (unlike butterflies that live on nectar), a task made more difficult by the fact that they lack the mouthparts needed to chew it. How do these insects manage to devour pollen grains too big to swallow, without the ability to chew them? The answer lies in their saliva: Heliconius butterflies have evolved special enzymes to break down pollen into edible components. In a recent collaboration between the Briscoe, Martin, and NCASD labs, the team has identified the enzymes (called cocoonases) used by the butterflies to pre-digest pollen and modeled their structures. The team found numerous different cocoonases used by Heliconius butterflies, but there is a catch: while conventional evolutionary wisdom would suggest that different species would have different enzymes, in fact each species has the whole complement. The mystery deepened when the researchers discovered that the different cocoonases had identical active sites, implying that they did not substantially differ in their preferred chemical targets. Why maintain a whole spectrum of enzymes that all do the same thing? The solution to the mystery lies in the outside of the enzyme. The team found that while their active sites were the same, the Heliconius cocoonases varied systematically in terms of their surface properties. These differences appear to have evolved to cope with one of the challenges of pollen-eating: pollen is heterogeneous stuff. To get digestive enzymes into every nook and cranny of a pollen grain, a butterfly needs an arsenal of biomolecules, each ideal for diffusing into a different type of chemical environment. Armed with these “chemical teeth,” Heliconius butterflies are able to tap into a rich foodstuff that would otherwise be too tough to swallow.
Key to unlocking the mysteries of the cocoonases was a novel protocol for modeling biological molecules developed as a collaboration between the NCASD and Martin labs. In addition to solving basic problems in evolutionary biology, this work holds the potential to lead to new classes of enzymes with the ability to work in a wider range of medical or industrial settings. The research appears in the Proceedings of the Royal Society B, and can be found at http://rspb.royalsocietypublishing.org/content/285/1870/20172037.
Simulating the structure of complex networks is an important challenge for problems ranging from modeling organizational structure to understanding the behavior of protein aggregates. While recent years have seen many innovations in this area, obtaining provably high quality simulations for networks with complex dependence has remained an elusive goal. In a forthcoming paper in the Journal of Mathematical Sociology, NCASD Lab PI Carter Butts shows how this can be done. Introducing a family of simulation algorithms inspired by a technique called “coupling from the past,” Butts demonstrates the feasibility of obtaining exact draws from even highly complex sampling distributions, avoiding the sometimes problematic approximations inherent in current methods. Because these algorithms can be used for a very broad class of network models, they can be applied in a wide range of settings both within and beyond the social sciences. This work illustrates the potential for computational methods to extend the reach of scientific practice, a major theme of research in the NCASD Lab.
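For readers unfamiliar with “coupling from the past,” the basic idea can be shown on a toy problem. The sketch below is not one of the algorithms from the paper and does not simulate a network; it applies the classic monotone version of the technique to a small reflecting random walk, chosen here only because its stationary distribution (uniform over the states) is easy to verify. All names in the code are hypothetical.

```python
import random


def update(x, u, k, p_up=0.5):
    """One step of a reflecting walk on {0, ..., k}, driven by a uniform draw u.

    For a fixed u the map is monotone in x, so chains started at 0 and k
    sandwich every other starting state.
    """
    return min(x + 1, k) if u < p_up else max(x - 1, 0)


def cftp_draw(k, rng, p_up=0.5):
    """Return one exact draw from the walk's stationary distribution."""
    us = []  # us[i] is the random number driving the step at time -(i + 1)
    horizon = 1
    while True:
        # Extend the random numbers further into the past, reusing the
        # ones already generated for times closer to 0.
        while len(us) < horizon:
            us.append(rng.random())
        lo, hi = 0, k
        # Run both extreme chains forward from time -horizon to time 0.
        for i in reversed(range(horizon)):
            lo = update(lo, us[i], k, p_up)
            hi = update(hi, us[i], k, p_up)
        if lo == hi:
            # Every starting state has coalesced by time 0.
            return lo
        horizon *= 2  # not yet coalesced: start further back in time


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [cftp_draw(3, rng) for _ in range(2000)]
    print({state: draws.count(state) for state in range(4)})
```

The key design point is that the random numbers for times near 0 are reused each time the starting point is pushed further back; regenerating them would bias the result. Unlike running a chain forward for a fixed number of steps, the value returned at coalescence is an exact draw rather than an approximate one.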
NCASD Lab alumna Xuhong Zhang has hit the ground running as a postdoctoral fellow in the lab of Professor Debashis Ghosh, chair of the Department of Biostatistics and Informatics at the Colorado School of Public Health on the University of Colorado medical campus. After graduating this past summer, Dr. Zhang relocated to the Denver, Colorado area to begin her new position. She brings along her expertise in the analysis of complex data sets, development of novel techniques for analysis of dynamic data, and bioinformatics. At the NCASD lab, Dr. Zhang’s work included pathbreaking research on the adaptation of spectroscopic ideas to data on human interaction and the study of urban ecology, as well as work on novel approaches to the prediction of enzyme structure and function. We wish Dr. Zhang all the best as she embarks on this next chapter of her career!
NCASD Postdoc Gianmarc Grazioli, Lab PI Carter Butts, and Prof. Ioan Andricioaei from the UCI Chemistry Department have published new results showing how the performance of molecular dynamics simulations can be improved with a little help from machine learning. These results are contained in their forthcoming paper, “Automated Placement of Interfaces in Conformational Kinetics Calculations Using Machine Learning,” to appear in the Journal of Chemical Physics. Their new technique employs a machine learning approach known as a Support Vector Machine to automatically define high dimensional reaction coordinates for calculating chemical kinetics. This approach dramatically reduces the cost of studying the complex configurational changes of large biomolecules, such as proteins and DNA, as well as the cost of simulating high-dimensional systems such as those associated with complex chemical reactions. Understanding the complex motions of biomolecules and the kinetics of chemical reactions is essential not only for a deeper fundamental understanding of the molecular machinery that makes life possible, but also for such applications as the computational design of drug molecules and novel materials.
Lab PI Carter Butts has been elected to two offices in the American Sociological Association: the Section Council of the ASA Section on Methodology, and Chair-Elect of the ASA Section on Mathematical Sociology. Butts is currently serving a term on the Section Council of the ASA Section on Mathematical Sociology, and will transition from this office to the office of Chair in the coming year.
Sections of the ASA serve the sociological community by supporting research in specific fields of the discipline. The Section on Methodology supports work on novel techniques and practical advances in methods for the measurement and analysis of social phenomena, while the Section on Mathematical Sociology supports the development and use of mathematical, computational, and other formal approaches to the study of social systems. Research by NCASD lab members is frequently featured in conference activities by both sections, and we are pleased to have this opportunity to make further contributions to these important communities.
Lab Alumna Liana Landivar (Senior Researcher and Sociologist at the US Department of Labor) has released a new book, Mothers at Work: Who Opts Out? Landivar’s book examines a key question relating to the labor force participation of high-achieving American women: are mothers in managerial and professional occupations more likely to leave the labor force when they have children? Using four major government surveys, Mothers at Work offers a nationally representative account of mothers’ employment in 55 occupations and shows that women in managerial and professional occupations were the least likely to opt out but most likely to scale back by a few hours per week when they had children. By examining work-hour trends since 1970, this book shows that scaling back is taking place in a broader context of shorter work hours since the early 2000s across all groups of workers, including managers and professionals.
Landivar, who recently transitioned to the Department of Labor from the National Science Foundation, is an expert on gender and work, occupational trajectories, demography, and the STEM workforce. In addition to her appointment at the Department of Labor, Dr. Landivar holds an affiliation with the Maryland Population Research Center at the University of Maryland. Her work on the US labor force has been featured in White House and Congressional briefings, and has been covered widely in media outlets such as the Washington Post, the New York Times, the Wall Street Journal, and Science.
Chitin, the polysaccharide-based material from which insects make their exoskeletons, is tough stuff – and digesting it is a tall order, especially for a plant. Nevertheless, some carnivorous plants, like the Cape Sundew, Drosera capensis, are able to do just that. In a recent paper in Biochimica et Biophysica Acta, NCASD Lab PI Butts and members of the Martin Lab model the structures of 11 novel chitinases from D. capensis, whose genome was published as part of the same collaborative effort this past year. Applying a combination of bioinformatics, molecular modeling, and techniques adapted from social network analysis, the team was able to predict the three-dimensional structure of each enzyme and gain insights into potential functional differences. Among the discoveries is a novel chitinase with two active domains that closely resembles a protein seen in microorganisms, but never before found in plants. These new enzymes can do more than bite bugs: chitin is also the essential component of fungal cell walls, and these molecules may hold promise for combating fungal growth on food or even fungal infections in humans. This work demonstrates the potential for fusing computational and data analytic techniques with biological know-how to quickly move from genomic “source code” to potentially valuable biomolecules.
The NCASD Lab is pleased to welcome Dr. Gianmarc Grazioli, who will be joining as a postdoctoral scholar as of spring quarter, 2017. Dr. Grazioli, who obtained his Ph.D. in the Andricioaei lab before a one-year stint in the Paesani lab at UCSD, brings with him a wealth of expertise in molecular modeling, particularly the use of modified potentials for importance sampling of trajectories in order to explore rare transitions. In his new position, Dr. Grazioli will contribute to the team’s work on modeling of protein aggregation, statistical methods for prediction of structural and functional properties of biological macromolecules, and the development of network analytic methods for the study of biological systems. A computational chemist by training, Grazioli adds to the lab’s diverse mix of disciplinary expertise (currently spanning sociology, statistics, electrical engineering, and computer science), and deepens the group’s bench in expertise related to simulation and sampling techniques.