Latest News

Information Loss in Neural Classifiers from Sampling

An estimator is limited to the information it has about the variable it is estimating, and that information is in turn limited to what the estimator has seen in its training samples. Finite samples cannot transfer the full information of a random variable to an estimator; some information is lost. This presentation analyzes...

Training machines to understand the Universe

Upcoming large-scale datasets in astrophysics will challenge our ability to effectively analyze and interpret the data. Surveys of the 2020s (e.g., Euclid, LSST, WFIRST and SPHEREx) will provide multiple deep views of the universe, each survey with its own observational characteristics such as noise levels, resolution, and wavelength coverage. How do we best interpret the...

Constructing Confidence Intervals for Selected Parameters

In large-scale problems, it is common practice to select important parameters by a procedure such as the BH procedure (Benjamini and Hochberg, 1995) and construct confidence intervals (CIs) for further investigation while the false coverage-statement rate (FCR) for the CIs is controlled at a desired level. Although the well-known BY CIs (Benjamini and Yekutieli, 2005)...

Spatial Analytics for Efficient and Equitable Public Transportation

Assessing the performance of public transportation services has long been an important yet challenging issue for transportation agencies and researchers. However, the performance evaluation of transportation services is complicated by an array of quantitative measures available to assess the goals and the diversity in the goals themselves, which usually include improving operational efficiency and providing...

Leveraging big datasets to understand how ecological communities respond to global change

Simultaneous ongoing changes to Earth's ecosystems, including climate change and species invasions, are reshuffling ecological communities in space and time. Spatially, species distributions are shifting, often in species-specific ways, leading to novel communities. Changing climate is also altering species’ phenologies – i.e., the seasonal timing of life cycle events such as flowering, bird migration, or...

Multilevel Joint Modeling of Hospitalization and Survival in Patients on Dialysis

More than 720,000 patients with end-stage renal disease in the US require life-sustaining dialysis treatment that is predominantly received at local dialysis facilities. In this population of typically older patients with a high morbidity burden, hospitalization is frequent, occurring about twice per patient-year. Aside from frequent hospitalizations, which are a major source...

Modeling Data Using Regression: Testing Conjectures Strongly

Too often, exploratory approaches to data analysis are used, even in situations in which confirmatory methods could be used. The contrast between exploratory and confirmatory approaches to analyses will be emphasized, and several examples will be presented that illustrate the advantages of confirmatory methods — particularly the avoidance of Type II errors when confirmatory methods...

Combined Multi-scale Modeling and Experimental Study of Roles of Cell-Matrix Interactions

Blood clot contraction plays an important role in prevention of bleeding and in thrombotic disorders. We will unveil and quantify the structural mechanisms of clot contraction at the level of single platelets. In contrast to other cell–matrix systems in which cells migrate along fibers, we will demonstrate that the “hand-over-hand” longitudinal pulling causes shortening and...

Plant species persistence in the face of climate change

Climate change threatens the persistence of native plants across California, and strategies are needed to facilitate resilience and conserve the most vulnerable species. Conserving species under climate change is complicated, however, because the state’s native flora are threatened by other global changes, including altered disturbance regimes, land use change, and invasive species. We developed an...

Scientific Data: a Chemical Engineer’s Perspective

https://jwulab.engr.ucr.edu/home/abouttheprofessor

Estimating the determinants of child growth faltering: notes on measurement, models and microdata

Prof. Joseph Cummins, Department of Economics, UCR. Abstract: Early life growth faltering due to nutritional deficiencies and disease environment currently affects the health, productivity and lifespan of hundreds of millions of adults worldwide, with another 150 million children currently experiencing stunted growth. In this talk, I survey methods for estimating the effects of various individual-...

Some optics-related data science and machine learning problems

In this talk, I'll first present a broad overview of several established fields in physical and quantum optics that are ripe for data science and machine learning applications. Secondly, I'll present some metadata related to different fields of optics in a mini science-technology-and-society study. I'll end by speaking about my current research, which presents novel...

P-ENCODE: Resolving plant gene activity in space over time

The mammalian ENCODE (ENCyclopedia of DNA Elements) project funded by the NIH has sought to define the features and elements of chromatin, genes, RNAs and proteins that determine phenotype. Most ENCODE modules have focused on chromatin and DNA elements that regulate transcriptional activity. There has not been a plant “ENCODE” project. Yet the regulation of...

How methane emissions can be better counted and mitigated with “big” data

Methane is a greenhouse gas, and the second most important contributor to human-caused climate change. It is also an important target for climate mitigation policy in the state of California. However, the methane budget is poorly known, especially at scales that are most relevant for mitigation policy and planning. New observations of methane emissions, such...

Using data science to enhance protein biogenesis for biotechnology and medicine

Engineered proteins are at the heart of biotechnology and the biopharmaceutical industry. The research community now has several tools available to develop novel protein structures and functions. Computational methods can link fully designed structures back to sequence, and directed evolution can create enzymes that catalyze reactions not found in biology. However, these approaches rely on...

Making light work of dark matter - Algorithms for astronomical matter distributions

The current cosmological paradigm posits that at early times the distribution of matter in our universe had the statistics of a Gaussian process, and evolved primarily via gravitational collapse into the highly non-linear structures we see today. Both the numerical simulations and the observational catalogues produce big data that challenges the computational scaling of our...

An ‘instantaneous’ measure of dynamic functional connectivity

Assessing the function of late-stage cortical processing regions, such as prefrontal cortex, is notoriously challenging. Yet it is these areas that are hypothesized to subserve complex higher-order processes including self-evaluation of decisional uncertainty and even perceptual awareness. How can we measure the degree to which late-stage cortical processing areas have access to the representational content...

Tree Atlas of the California

Vegetation maps are a valuable resource for those interested in vegetation status and change. A map is a static baseline, but distributions frozen in time provide guidance on relationships between species and topography, geologic substrate, surface hydrology, and climate; vegetation maps in time-series can be used to document vegetation change. This atlas comprises maps of...

Open Geospatial Data Science

In this talk, I present an overview of spatial data science research occurring at the newly formed UCR Center for Geospatial Sciences. I first examine the broader context of geospatial data science and its intersection with the open source and open science movements. Next, I provide an overview of the open source Python Spatial Analysis...

Inference of Chromosome-length Haplotypes using Genomic Data of Three to Five Single Gametes

Knowledge of chromosome-length haplotypes will not only advance our understanding of the relationship between DNA and phenotypes, but will also promote a variety of important genetic applications. The current diploid-based phasing methods are costly and only produce haplotype fragments, whereas the alternatives based on analysis of haploid gametes, which are still in their early development...