Data Engineer III

The CW will support efforts to run CSAM filtering and other mitigations on large-scale datasets (image, video, text) leveraged by FAIR research teams. The goal is to proactively mitigate potential risks associated with these datasets.


Specifically, the CW will assist in the following tasks:

Preprocessing: converting original datasets into a format that can be consumed by CSAM filtering and other pipelines.

Filtering: running filtering using Integrity's pipeline or Spidermate's pipeline.

Post-processing: applying the filtering results to the original datasets, then repackaging and re-ingesting them.


To complete the tasks outlined in the scope of work, the CW will need to have the following skills:

Technical skills: Experience building and maintaining data pipelines, performing large-scale data transfers, and working with data storage solutions

Data Management: Data preprocessing and cleaning, transformation and formatting, data quality control and validation

Communication: Effective communication skills to collaborate with stakeholders and team members

Software engineering: Writing scripts to automate file processing and data transfer, and creating tools to improve productivity and streamline workflows


CWs with prior experience supporting data efforts at Meta (a full 2 years) are strongly preferred, as their existing knowledge of Meta's internal tools and processes would allow faster onboarding and more effective contribution to the FAIR CSAM mitigation efforts.


Location
Remote Daly City
Employment Basis
Contract
Salary Range