
Research Scientist - Science of Evaluations at AI Safety Institute
London, United Kingdom


Job Description
About The AI Safety Institute

The AI Safety Institute (AISI), launched at the 2023 Bletchley Park AI Safety Summit, is the world's first state-backed organization dedicated to advancing AI safety for the public interest. Our mission is to assess and mitigate risks from frontier AI systems, including cyber attacks on critical infrastructure, AI-enhanced chemical and biological threats, large-scale societal disruptions, and potential loss of control over increasingly powerful AI. In just one year, we've assembled one of the largest and most respected model evaluation teams, featuring renowned scientists and senior researchers from leading AI labs such as Anthropic, DeepMind, and OpenAI.

At AISI, we're building the premier institution for impacting both technical AI safety and AI governance. We conduct cutting-edge research, develop novel evaluation tools, and provide crucial insights to governments, companies, and international partners. By joining us, you'll collaborate with the brightest minds in the field, directly shape global AI policies, and tackle complex challenges at the forefront of technology and ethics. Whether you're a researcher, engineer, or policy expert, at AISI, you're not just advancing your career – you're positioned to have significant impact in the age of artificial intelligence.

The Science of Evaluations Team

AISI’s Science of Evaluations team will conduct applied and foundational research focused on two areas at the core of our mission: (i) measuring existing frontier AI system capabilities and (ii) predicting the capabilities of a system before running an evaluation.

Measurement of Capabilities: the goal is to develop and apply rigorous scientific techniques for the measurement of frontier AI system capabilities, so that they are accurate, robust, and useful in decision making. This is a nascent area of research which supports one of AISI's core products: conducting tests of frontier AI systems and feeding back results, insights, and recommendations to model developers and policy makers.

The team will be an independent voice on the quality of our testing reports and the limitations of our evaluations. You will collaborate closely with researchers and engineers from the workstreams who develop and run our evaluations, getting into the details of their key strengths and weaknesses, proposing improvements, and developing techniques to get the most out of our results.

The key challenge is increasing the confidence in our claims about system capabilities, based on solid evidence and analysis. Directions we are exploring include:
• Running internal red teaming of testing exercises and adversarial collaborations with the evaluations teams, and developing “sanity checks” to ensure the claims made in our reports are as strong as possible. Example checks could include: analysing performance as a function of context length, auditing areas with surprising model performance, checking for soft-refusal performance issues, and efficiently comparing system performance between pre-deployment and post-deployment testing.
• Running in-depth analyses of evaluations results to understand successes and failures, and using these insights to create best practices for testing exercises.
• Developing our approach to uncertainty quantification and significance testing, increasing statistical power (given time and token constraints); a minimal sketch of this kind of analysis appears after this list.
• Developing methods for inferring model capabilities across given domains from task or benchmark success rates, and assigning confidence levels to claims about capabilities.
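
To give a concrete flavour of the measurement work described above, here is a minimal sketch in Python of two of the listed techniques: a Wilson score interval for a task success rate, and a paired bootstrap confidence interval for the change in success rate between pre-deployment and post-deployment runs on the same tasks. All data and function names are illustrative assumptions, not AISI tooling.

```python
# Minimal sketch of uncertainty quantification for evaluation results.
# All data below are synthetic; this is illustrative, not AISI's actual tooling.
import numpy as np
from scipy import stats

def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a binomial task success rate."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    p = successes / n
    centre = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return centre - half, centre + half

def paired_bootstrap_diff(pre: np.ndarray, post: np.ndarray,
                          n_boot: int = 10_000, seed: int = 0):
    """Bootstrap CI for the change in success rate across the same task set."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(pre), size=(n_boot, len(pre)))  # resample tasks jointly
    diffs = (post[idx] - pre[idx]).mean(axis=1)
    return np.quantile(diffs, [0.025, 0.975])

# Hypothetical binary outcomes for 200 tasks, run pre- and post-deployment.
rng = np.random.default_rng(1)
pre = rng.binomial(1, 0.62, 200)
post = rng.binomial(1, 0.66, 200)
print("Pre-deployment 95% CI:", wilson_interval(int(pre.sum()), len(pre)))
print("95% CI for pre-to-post change:", paired_bootstrap_diff(pre, post))
```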

Predictive Evaluations: the goal is to develop approaches to estimate the capabilities of frontier AI systems on tasks or benchmarks, before they are run. Ideally, we would be able to do this at some point early in the training process of a new model, using information about the architecture, dataset, or training compute. This research aims to provide us with advance warning of models reaching a particular level of capability, where additional safety mitigations may need to be put in place. This work is complementary to both safety cases (an AISI foundational research effort) and AISI's general evaluations work.
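
As a hedged illustration of the simplest version of this idea, the sketch below fits a logistic curve of benchmark success rate against log training compute for a handful of hypothetical earlier models, extrapolates to a larger compute budget, and inverts the fit to estimate where a capability threshold would be crossed. The sigmoid-in-log-compute functional form and every number here are assumptions made for illustration; published approaches such as Ruan et al. (2024) are considerably more sophisticated.

```python
# Toy predictive-evaluation sketch: extrapolating benchmark performance from
# training compute. Functional form and data are illustrative assumptions.
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(log_compute, midpoint, slope):
    """Benchmark success rate modelled as logistic in log10 training FLOP."""
    return 1.0 / (1.0 + np.exp(-slope * (log_compute - midpoint)))

# Hypothetical (log10 FLOP, success rate) observations for past models.
log_flop = np.array([21.0, 22.0, 23.0, 24.0, 25.0])
scores = np.array([0.05, 0.12, 0.30, 0.55, 0.78])

params, _ = curve_fit(sigmoid, log_flop, scores, p0=[24.0, 1.0])
print(f"Predicted score at 10^26 FLOP: {sigmoid(26.0, *params):.2f}")

# Invert the fit: at what compute would 90% success be crossed?
# This is the kind of "advance warning" signal described above.
midpoint, slope = params
log_flop_at_threshold = midpoint - np.log(1 / 0.9 - 1) / slope
print(f"Estimated 90% threshold crossing: 10^{log_flop_at_threshold:.1f} FLOP")
```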

This topic is currently an area of active research (e.g., Ruan et al., 2024), and we believe it is poised to develop rapidly. We are particularly interested in developing predictive evaluations for complex, long-horizon agent tasks, since we believe this will be the most important type of evaluation as AI capabilities advance. You will help develop this field of research, both by direct technical work and via collaborations with external experts, partner organizations, and policy makers.

Across both focus areas, there will be significant scope to contribute to the overall vision and strategy of the Science of Evaluations team as an early hire. You’ll receive coaching from your manager and mentorship from the research directors at AISI (including Geoffrey Irving and Yarin Gal), and work closely with talented Policy/Strategy leads, Research Engineers, and Research Scientists.

Responsibilities

This role offers the opportunity to progress deep technical work at the frontier of AI safety and governance. Your work will include:
• Running internal red teaming of testing exercises and adversarial collaborations with the evaluations teams, and developing “sanity checks” to ensure the claims made in our reports are as strong as possible.
• Conducting in-depth analysis of evaluations methodology and results, diagnosing possible sources of uncertainty or bias, to improve our confidence in estimates of AI system capabilities.
• Improving the statistical analysis of evaluations results (e.g. model selection, hypothesis testing, significance testing, uncertainty quantification); see the sketch after this list.
• Developing and implementing internal best-practices and protocols for evaluations and testing exercises.
• Staying well informed of the details, strengths, and weaknesses of evaluations across domains at AISI, and of the state of the art in frontier AI evaluations research more broadly.
• Conducting research on predictive evaluations, applying the latest techniques from the published literature to AISI’s internal evaluations, as well as conducting novel research to improve these techniques.
• Writing and editing scientific reports and other materials aimed at diverse audiences, focusing on synthesising empirical results and recommendations to key decision-makers, ensuring high standards in clarity, precision, and style.
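
As one concrete example of the statistical analysis mentioned in the list above, the sketch below compares two models on a shared task set using an exact McNemar-style test on discordant outcomes, a standard way to ask whether an observed difference in success rates is significant when both models are run on the same tasks. The data and names are hypothetical.

```python
# Illustrative paired significance test for two models on the same task set.
# Synthetic data; one of several reasonable tests, not a prescribed method.
import numpy as np
from scipy import stats

def mcnemar_exact(model_a: np.ndarray, model_b: np.ndarray) -> float:
    """Two-sided exact test on per-task outcomes (1 = solved, 0 = failed)."""
    a_only = int(np.sum((model_a == 1) & (model_b == 0)))  # tasks only A solves
    b_only = int(np.sum((model_a == 0) & (model_b == 1)))  # tasks only B solves
    n_discordant = a_only + b_only
    if n_discordant == 0:
        return 1.0
    # Under H0 (equal capability), either model is equally likely to win a
    # discordant task, so wins for B follow Binomial(n_discordant, 0.5).
    return stats.binomtest(b_only, n_discordant, 0.5).pvalue

# Hypothetical per-task outcomes for two models on the same 300 tasks.
rng = np.random.default_rng(7)
model_a = rng.binomial(1, 0.60, 300)
model_b = rng.binomial(1, 0.68, 300)
print(f"Two-sided p-value: {mcnemar_exact(model_a, model_b):.3f}")
```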

Person Specification

To set you up for success, we are looking for some of the following skills, experience and attitudes, but we are flexible in shaping the role to your background and expertise.
• Experience working within a world-leading team in machine learning or a related field (e.g. multiple first author publications at top-tier conferences).
• Strong track-record of academic excellence (e.g. PhD in a technical field and/or spotlight papers at top-tier conferences).
• Comprehensive understanding of large language models (e.g. GPT-4), including both a broad understanding of the literature and hands-on experience with designing and running evaluations, scaling laws, fine-tuning, scaffolding, and prompting.
• Broad experience in empirical research methodologies, potentially in fields outside of machine learning, and statistical analysis (T-shaped: some deep knowledge, lots of shallow knowledge, in e.g. experimental design, A/B testing, Bayesian inference, model selection, hypothesis testing, significance testing).
• Deeply care about methodological and statistical rigor, balanced with pragmatism, and willingness to get into the weeds.
• Experience with data visualization and presentation.
• Proven track record of excellent scientific writing and communication, with the ability to understand and communicate complex technical concepts for non-technical stakeholders and to synthesize scientific results into compelling narratives.
• Motivated to conduct technical research with an emphasis on direct policy impact rather than exploring novel ideas.
• Have a sense of mission, urgency, and responsibility for success, demonstrating problem-solving abilities and preparedness to acquire any missing knowledge necessary to get the job done.
• Ability to work autonomously and in a self-directed way with high agency, thriving in a constantly changing environment and a steadily growing team.
• Bring your own voice and experience but also an eagerness to support your colleagues together with a willingness to do whatever is necessary for the team’s success.

Salary & Benefits

Role

We are hiring individuals across a range of seniority and experience levels within the research unit, and this advert allows you to apply for any of the roles within that range. We will discuss and calibrate the level with you as part of the process. The full range of salaries available is as follows:
• L4: £85,000 - £95,000
• L5: £105,000 - £115,000
• L6: £125,000 - £135,000
• L7: £145,000

The Department for Science, Innovation and Technology offers a competitive mix of benefits including:
• A culture of flexible working, such as job sharing, homeworking and compressed hours.
• Automatic enrolment into the Civil Service Pension Scheme, with an average employer contribution of 27%.
• A minimum of 25 days of paid annual leave, increasing by 1 day per year up to a maximum of 30.
• An extensive range of learning & professional development opportunities, which all staff are actively encouraged to pursue.
• Access to a range of retail, travel and lifestyle employee discounts.
• The Department operates a discretionary hybrid working policy, which provides for a combination of working hours from your place of work and from your home in the UK. The current expectation for staff is to attend the office or non-home based location for 40-60% of the time over the accounting period.

Selection Process

In accordance with the Civil Service Commission rules, the following list contains all selection criteria for the interview process.

The interview process may vary from candidate to candidate; however, you should expect a typical process to include some technical proficiency tests, discussions with a cross-section of our team at AISI (including non-technical staff), and conversations with your workstream lead. The process will culminate in a conversation with members of the senior team here at AISI.

Candidates should expect to go through some or all of the following stages once an application has been submitted:
• Initial interview
• Technical take home test
• Second interview and review of take home test
• Third interview
• Final interview with members of the senior team

Required Experience

We select based on skills and experience in the following areas:
• Empirical research and statistical analysis
• Frontier AI model architecture, training, and evaluation knowledge
• AI safety research knowledge
• Written communication
• Verbal communication
• Research problem selection

Additional Information

Internal Fraud Database

The Internal Fraud function of the Fraud, Error, Debt and Grants Function at the Cabinet Office processes details of civil servants who have been dismissed for committing internal fraud, or who would have been dismissed had they not resigned. The Cabinet Office receives these details from participating government organisations; the civil servants concerned are then banned for 5 years from further employment in the civil service. The Cabinet Office processes this data and discloses a limited dataset back to DLUHC as a participating government organisation. DLUHC then carries out pre-employment checks to detect instances where known fraudsters are attempting to reapply for roles in the civil service. In this way, the policy is enforced and the repetition of internal fraud is prevented. For more information, please see the Internal Fraud Register.

Security

Successful candidates must undergo a criminal record check and must meet the security requirements before they can be appointed. The level of security needed is counter-terrorist check (opens in a new window). See our vetting charter (opens in a new window). People working with government assets must complete baseline personnel security standard (opens in new window) checks.

Nationality Requirements

We may be able to offer roles to applicants of any nationality or background. As such, we encourage you to apply even if you do not meet the standard nationality requirements (opens in a new window).

Working for the Civil Service

The Civil Service Code (opens in a new window) sets out the standards of behaviour expected of civil servants. We recruit by merit on the basis of fair and open competition, as outlined in the Civil Service Commission's recruitment principles (opens in a new window). The Civil Service embraces diversity and promotes equal opportunities. As such, we run a Disability Confident Scheme (DCS) for candidates with disabilities who meet the minimum selection criteria. The Civil Service also offers a Redeployment Interview Scheme to civil servants who are at risk of redundancy, and who meet the minimum requirements for the advertised vacancy.

Diversity and Inclusion

The Civil Service is committed to attracting, retaining and investing in talent wherever it is found. To learn more, please see the Civil Service People Plan (opens in a new window) and the Civil Service Diversity and Inclusion Strategy (opens in a new window).
