[Efficient Server AI Track]: Lowering the Costs of Fine-Tuning Foundation Models | Kisaco Research

Pre-training Foundation Models is prohibitively expensive and therefore out of reach for many companies, especially when the models are Large Language Models (LLMs). Still, there is hope that Foundation Models will live up to their promise of learning more generally than classical Artificial Intelligence (AI) models. The dream is that, given just a few examples, a Foundation Model could extract a high-level, abstract representation of the problem and learn to accomplish tasks it was never trained to perform. So the question is: how can you lower the cost of fine-tuning pre-trained Foundation Models for your needs? That is what we will discuss in this panel. We will share our personal experience, synthesized into a set of principles, so that you can see how we lowered the cost of fine-tuning pre-trained Foundation Models across multiple domains.

Speaker(s): 
Moderator

Fausto Artico

Head of Innovation and Data Science
GSK

Fausto holds two PhDs (in Information and Computer Science, respectively), having earned his second master’s and PhD at the University of California, Irvine. He also holds multiple certifications from MIT, Columbia University, the London School of Economics and Political Science, Kellogg School of Management, and the University of Cambridge, and will soon hold one from the University of California, Berkeley. He has worked in multi-disciplinary teams and has over 20 years of experience in academia and industry.

As a Physicist, Mathematician, Engineer, Computer Scientist, and High-Performance Computing (HPC) and Data Science expert, Fausto has worked on key projects at European and American government institutions and with key individuals, such as Nobel Prize winner Michael J. Prather. After his time at NVIDIA Corporation in Silicon Valley, Fausto worked at the IBM T. J. Watson Research Center in New York on Exascale Supercomputing Systems for the US government (e.g., Livermore and Oak Ridge Labs).

Panellists

Lisa Cohen

Director of Data Science for Gemini, Google Assistant, and Search Platforms
Google

Lisa Cohen is Director of Data Science for Gemini (formerly "Bard"), Google Assistant, and Search Platforms. She leads an organization of data scientists at Google that is responsible for using data to create excellent user experiences across these products, partnering closely with Product, Engineering, and User Experience Research. Formerly, Lisa was Head of Data Science and Engineering for Twitter, helping drive the strategy and direction of the Twitter product through machine learning, metric development, experimentation, and causal analyses. Before Twitter, Lisa led the Azure Customer Growth Analytics organization as part of Microsoft Cloud Data Sciences. Her team was responsible for analyzing OKRs, informing data-driven decisions, and developing data science models to help customers be successful on Azure. Lisa worked at Microsoft for 17 years and also helped develop multiple versions of Visual Studio. She holds bachelor's and master's degrees in Applied Mathematics from Harvard. You can follow Lisa on LinkedIn and Medium.

Jeff Boudier

Product Director
Hugging Face

Jeff Boudier is a product director at Hugging Face, creator of Transformers, the leading open-source NLP library. Previously, Jeff was a co-founder of Stupeflix, acquired by GoPro, where he served as director of Product Management, Product Marketing, Business Development, and Corporate Development.

Helen Byrne

VP, Solution Architect
Graphcore

Helen leads the Solution Architects team at Graphcore, helping innovators build their AI solutions using Graphcore’s Intelligence Processing Units (IPUs). She has been at Graphcore for more than 5 years, previously leading AI Field Engineering and working in AI Research on problems in Distributed Machine Learning. Before landing in the technology industry, she worked in Investment Banking. Her background is in Mathematics, and she has an MSc in Artificial Intelligence.
