Pre-training Foundation Models is prohibitively expensive and therefore impossible for many companies. This is especially true if the models are Large Language Models (LLMs). However, people hope that Foundation Models will live up to the promise of learning more generally than classical Artificial Intelligence (AI) models. The dream is that if you provide just a few examples to Foundation Models, they could extrapolate the high-level, abstract representation of the problem and learn how to accomplish tasks that they have never been trained to execute before. So, the question is, how can you lower the cost of fine-tuning pre-trained Foundation Models for your needs? This is what we will discuss in this panel. We make available to you our personal experience, synthetized in a set of principles, so that you can discover how we found ways to lower the cost of fine-tuning pre-trained Foundational Models across multiple domains.
Fausto Artico
Fausto has two PhDs (Information & Computer Science respectively), earning his second master’s and PhD at the University of California, Irvine. Fausto also holds multiple certifications from MIT, Columbia University, London School of Economics and Political Science, Kellogg School of Management, University of Cambridge and soon also from the University of California, Berkeley. He has worked in multi-disciplinary teams and has over 20 years of experience in academia and industry.
As a Physicist, Mathematician, Engineer, Computer Scientist, and High-Performance Computing (HPC) and Data Science expert, Fausto has worked on key projects at European and American government institutions and with key individuals, like Nobel Prize winner Michael J. Prather. After his time at NVIDIA corporation in Silicon Valley, Fausto worked at the IBM T J Watson Center in New York on Exascale Supercomputing Systems for the US government (e.g., Livermore and Oak Ridge Labs).
Lisa Cohen
Lisa Cohen is Director of Data Science for Gemini (formerly "Bard"), Google Assistant, and Search Platforms. She leads an organization of data scientists at Google, responsible for using data to create excellent user experiences across these products, and partnering closely with Product, Engineering, and User Experience Research. Formerly, Lisa was Head of Data Science and Engineering for Twitter, helping drive the strategy and direction of the Twitter product, through machine learning, metric development, experimentation and causal analyses. Before Twitter, Lisa led the Azure Customer Growth Analytics organization as part of Microsoft Cloud Data sciences. Her team was responsible for analyzing OKRs, informing data-driven decisions, and developing data science models to help customers be successful on Azure. Lisa worked at Microsoft for 17yrs, and also helped develop multiple versions of Visual Studio. She holds Bachelor and Masters degrees from Harvard in Applied Mathematics. You can follow Lisa on LinkedIn and Medium.
Jeff Boudier
Jeff Boudier is a product director at Hugging Face, creator of Transformers, the leading open-source NLP library. Previously Jeff was a co-founder of Stupeflix, acquired by GoPro, where he served as director of Product Management, Product Marketing, Business Development and Corporate Development.
Helen Byrne
Helen leads the Solution Architects team at Graphcore, helping innovators build their AI solutions using Graphcore’s Intelligence Processing Units (IPUs). She has been at Graphcore for more than 5 years, previously leading AI Field Engineering and working in AI Research, working on problems in Distributed Machine Learning. Before landing in the technology industry, she worked in Investment Banking. Her background is in Mathematics and she has a MSc in Artificial Intelligence.