Assessing material performance in the fab

Once all relevant data is available and accessible, you can analyze it. Athinia® offers a set of common analysis methods built in, and custom methods can be built on top.
The methods listed below are commonly used in the semiconductor fabrication environment for investigation and troubleshooting. Examples of outputs from JMP, the most commonly used statistical analysis software in the industry, are included for illustrative purposes.
Analysis methods:
  • Mosaic plots / Contingency tables for categorical variables
  • Box plots
  • One-way ANOVA
  • Means testing
  • Simple and multiple linear regression (incorporating linearization/transformation of data, if relevant)

A wide array of composable templates for self-service, no-code data analytics

Data preparation: Joins and aggregations
Testing of Assumptions: Suggestion of best methodology for shape of data
Feature reduction:
  • KernelPCA
  • PCA
  • PLS
  • Correlation
  • TSNE
  • Lasso Regression
Descriptive/Predictive models:
  • OLS Linear Regression
  • Logistic Regression
  • XGBoost Regressor
  • Random Forest
  • Box Plots
  • Scatter Plots
  • PCA Feature Importance
  • SHAP Feature Importance
  • Correlation Matrix

Recommendation engine suggests model and surfaces data issues

  • Customer datasets are tested against hundreds of assumptions, curated and maintained by our data science team
  • Tool clearly identifies limitations in the source data and suggests improvements
  • Suitable algorithms are highlighted and can be deployed with a few clicks
  • Assumptions are tested continuously as the underlying data updates
The most granular level for collecting and aggregating fab data as it relates to process materials is the wafer. Because variations in cross-wafer uniformity are generally caused by equipment-level issues, summary statistics for the multiple data points collected across each wafer should suffice for establishing relationships to material performance. However, not all wafers are measured at each inline metrology step, so data must also be aggregated at the lot level and assumed representative of all wafers in the lot. Lots are batches of 25 wafers that are carried together in the same wafer transport pod through the fab and processed together under the same conditions.
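The two aggregation levels described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library. The records, the field order (`lot_id`, `wafer_id`, `thickness_nm`), and the choice of summary statistics are assumptions made for the example, not an Athinia schema.

```python
from collections import defaultdict
from statistics import fmean, pstdev

# Hypothetical site-level metrology readings: (lot_id, wafer_id, thickness_nm).
# Only some wafers in each lot are measured.
site_readings = [
    ("L001", "W002", 100.0), ("L001", "W002", 102.0), ("L001", "W002", 101.0),
    ("L001", "W005", 99.0), ("L001", "W005", 101.0),
    ("L002", "W003", 104.0), ("L002", "W003", 106.0),
]

def summarize_wafers(readings):
    """Collapse cross-wafer site readings into per-wafer summary statistics."""
    by_wafer = defaultdict(list)
    for lot_id, wafer_id, value in readings:
        by_wafer[(lot_id, wafer_id)].append(value)
    return {
        key: {
            "mean": fmean(vals),
            "min": min(vals),
            "max": max(vals),
            "std": pstdev(vals),
            "n_sites": len(vals),
        }
        for key, vals in by_wafer.items()
    }

def summarize_lots(wafer_stats):
    """Average the measured wafers' means to get one value per lot."""
    by_lot = defaultdict(list)
    for (lot_id, _wafer_id), stats in wafer_stats.items():
        by_lot[lot_id].append(stats["mean"])
    return {
        lot_id: {"mean": fmean(means), "n_wafers_measured": len(means)}
        for lot_id, means in by_lot.items()
    }

wafer_stats = summarize_wafers(site_readings)
lot_stats = summarize_lots(wafer_stats)
```

Keeping `n_wafers_measured` next to each lot value makes it visible how few wafers the lot-level number actually rests on.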
Generally, the fab data with the strongest interactions with material performance are:
  • data on the state and quality of the incoming wafer, as defined by the immediately preceding process and its metrology step
  • data from the process in which the material is used and its subsequent metrology step
  • data from the immediate downstream process
  • data from subsequent inline electrical tests that measure device characteristics related to the structures formed through the integration scheme incorporating the material
  • data from final device electrical characterization (probe data) and wafer yield

Schematic overview of relevant fab process and measurement steps

When collected, aggregated, and meshed, the fab-level data has the basic structure shown in Table 1. Process and equipment sensor data is typically available for each wafer. Metrology data is sparser at the wafer level, though it should exist for each lot. When aggregated, one material batch will correspond to several lots and many wafers. The fab data typically consists of two types, both of which are included in the aggregated data:
  • Categorical: identifiers for the specific tool, specific process chamber, other relevant identifiers, or descriptions
  • Numerical: typically summary statistics of the process and/or equipment sensor data, e.g. min, max, and mean
Table 1.  Basic structure of fab data that relates to specific process material.
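Material Batch | Lot ID | Wafer ID | Prior Step | Current Step | Next Step | Material Batch | Electrical Test 1 | Electrical Test 2 | Electrical Test 3 | Final Yield
M001 | L001 | W002 | x-circle | | | | | | |
M001 | L001 | W005 | x-circle | | | | | | |
M001 | L001 | W009 | x-circle | | | | | | |
M001 | L001 | W014 | x-circle | | | | | | |
M001 | L001 | W019 | | | | | | | |
M001 | L001 | W024 | | | | | | | |
M001 | L002 | W003 | | | | | | | |
M001 | LXXX | | | | | | | | |
M002 | LXXY | | | | | | | | |
M002 | LYYY | | | | | | | | |
M003 | LYYZ | | | | | | | | |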
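One way to mesh wafer-level process data with sparser metrology data into the shape of Table 1 is sketched below. It assumes a simple fallback rule: use the wafer's own measurement when it exists, otherwise the lot-level value, and record which one was used. All field names (`chamber_id`, `rf_power_mean`, `thickness_nm`) and values are hypothetical.

```python
# Hypothetical wafer-level process records; every field name is illustrative.
wafer_process = [
    {"material_batch": "M001", "lot_id": "L001", "wafer_id": "W002",
     "chamber_id": "CH-A", "rf_power_mean": 501.2},
    {"material_batch": "M001", "lot_id": "L001", "wafer_id": "W019",
     "chamber_id": "CH-B", "rf_power_mean": 498.7},
    {"material_batch": "M001", "lot_id": "L002", "wafer_id": "W003",
     "chamber_id": "CH-A", "rf_power_mean": 500.4},
]

# Sparse metrology: measured wafers keyed by (lot_id, wafer_id),
# plus lot-level aggregates keyed by lot_id.
wafer_metrology = {("L001", "W002"): 101.0}
lot_metrology = {"L001": 100.5}

def mesh_fab_data(process_rows, wafer_values, lot_values):
    """Attach a metrology value to every wafer row, noting its source."""
    meshed = []
    for row in process_rows:
        wafer_key = (row["lot_id"], row["wafer_id"])
        if wafer_key in wafer_values:
            value, source = wafer_values[wafer_key], "wafer"
        elif row["lot_id"] in lot_values:
            value, source = lot_values[row["lot_id"]], "lot"
        else:
            value, source = None, "missing"
        meshed.append({**row, "thickness_nm": value, "metrology_source": source})
    return meshed

meshed_rows = mesh_fab_data(wafer_process, wafer_metrology, lot_metrology)
```

Carrying the `metrology_source` column forward lets later analyses separate measured wafers from those that only inherit a lot-level value.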
The journey towards understanding how material variations affect the fab starts with this data. A quality or process engineer will take this type of dataset and begin an investigation by determining whether wafers with the material of interest move through specific pathways, for example by passing through certain tools or chambers more frequently. This means looking for relationships between categorical variables: material batch IDs, lot IDs, and wafer IDs are compared against equipment IDs, chamber IDs, and so on. In JMP, engineers run this analysis by creating Mosaic Plots and Contingency Tables (see the example below) [1].

Example of Mosaic Plot and Contingency Table from JMP [2]

If there are any clear clusters or patterns here, engineers may analyze subsets of the data grouped by specific pathways. Otherwise, this information is retained for future reference.
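Outside JMP, the same categorical comparison can be sketched with a contingency table and a Pearson chi-square statistic. The example below uses only the Python standard library. The batch and chamber labels and the counts are invented for illustration, and the function returns the statistic and degrees of freedom only. A p-value would additionally need the chi-square distribution, which the standard library does not provide.

```python
from collections import Counter

# Hypothetical (material_batch, chamber_id) pairs, one per processed wafer.
observations = (
    [("M001", "CH-A")] * 8 + [("M001", "CH-B")] * 2
    + [("M002", "CH-A")] * 3 + [("M002", "CH-B")] * 7
)

def contingency_table(pairs):
    """Count wafers for every (row category, column category) combination."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if batch and chamber were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof

batches, chambers, table = contingency_table(observations)
chi2, dof = chi_square_statistic(table)
```

A large statistic relative to its degrees of freedom suggests that some batches pass through certain chambers more often than chance would explain, which is the pattern a mosaic plot shows visually.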
Next, process behavior and performance will be investigated at the batch, lot and potentially wafer levels. Typically, data will first be visualized with box plots to get a sense of variability within and across lots. Sometimes, histograms will be plotted alongside box plots.

Examples of box plots from JMP [3]
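The numbers behind a box plot can also be computed directly, which helps when comparing many lots programmatically. The sketch below assumes the common 1.5 × IQR whisker rule and the "inclusive" quartile method from Python's `statistics` module. JMP may use a different quantile definition, so small differences in quartile values are to be expected. The lot values are invented.

```python
from statistics import quantiles

def box_plot_stats(values):
    """Quartiles, whisker ends, and outliers using the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # smallest value within the fences
        "whisker_high": inside[-1],  # largest value within the fences
        "outliers": outliers,
    }

# Hypothetical per-lot measurements (arbitrary units).
lots = {
    "L001": [98, 99, 100, 100, 101, 102, 110],
    "L002": [103, 104, 104, 105, 106],
}
stats_by_lot = {lot_id: box_plot_stats(vals) for lot_id, vals in lots.items()}
```

Comparing `median` and `q3 - q1` across lots gives the same within-lot and across-lot picture that side-by-side box plots give visually.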

After visual inspection, one-way ANOVA is used to determine statistically whether lots and/or material batches are equivalent. Scatter plots of data points against batches, lots, or wafers ordered by time are also used to identify drift over time. Differences across batches and/or lots are noted for deeper exploration. Note that methods such as PCA and PLS are not commonly deployed to find unexpected relationships, and neither are more unstructured clustering methods. In short, engineering judgement and means testing typically serve to establish correlations and subgroupings for deeper analysis.
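For readers who want to see what one-way ANOVA computes, the F statistic can be written out by hand. The sketch below uses only the standard library and invented measurements for three lots. It returns the F statistic and its degrees of freedom but no p-value. If SciPy is available, `scipy.stats.f_oneway` returns the same statistic together with a p-value.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = fmean(all_values)
    group_means = [fmean(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - mean) ** 2
        for group, mean in zip(groups, group_means)
        for v in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements for three lots (arbitrary units).
measurements_by_lot = {
    "L001": [1.0, 2.0, 3.0],
    "L002": [2.0, 3.0, 4.0],
    "L003": [5.0, 6.0, 7.0],
}
f_stat, df_between, df_within = one_way_anova_f(list(measurements_by_lot.values()))
```

A large F means the lot means differ by more than the within-lot scatter would suggest, so the lots should not be treated as equivalent without a closer look.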

ANOVA and Means Testing output in JMP

Relationships between process and equipment data and the related metrology, electrical test, and yield data are established through regression. One response variable is usually tested at a time, with both simple and multiple linear regression. Non-linear relationships are typically approximated by linearizing or transforming the explanatory variables, usually based on known engineering or scientific relationships, e.g. using 1/T instead of temperature directly, or t² instead of time. The quality of such regressions is judged by R², p-values, and confidence intervals.
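The effect of linearization can be illustrated with a small synthetic example. The sketch below assumes an Arrhenius-type response, rate = exp(a + b/T), which is an illustrative choice and not a model taken from this document. It fits a simple least-squares line twice: once to the raw variables and once to ln(rate) against 1/T. The constants and temperatures are invented.

```python
import math
from statistics import fmean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x; returns R^2 too."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic, noise-free data generated from the assumed relationship.
temps_k = [300.0, 320.0, 340.0, 360.0, 380.0]
rates = [math.exp(5.0 - 2000.0 / t) for t in temps_k]

# Fit on the raw variables: rate against T.
slope_raw, intercept_raw, r2_raw = fit_line(temps_k, rates)

# Fit on the linearized variables: ln(rate) against 1/T.
inv_temps = [1.0 / t for t in temps_k]
log_rates = [math.log(r) for r in rates]
slope_lin, intercept_lin, r2_lin = fit_line(inv_temps, log_rates)
```

With these synthetic data the transformed fit recovers the generating constants and has the higher R², while the raw fit is only an approximation of a curved relationship. Real fab data will include noise, so the gap between the two fits will be less clear-cut.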

Example of an output of multiple regression analysis in JMP [4]

Results of regression analyses are not typically used as precise predictive models, but as directional guides to identify specific relationships and suggest process or setup changes. Note that second-order interactions between explanatory variables are often neglected.
Assessing material data at the supplier side
The statistical methods used by fab engineers described above are also very commonly used by material suppliers. However, the use of additional methods such as PCA and PLS is growing, due to greater challenges in uncovering and interpreting relationships across different types of data in the supplier's arena. Here we describe the data that suppliers collect, and some of the analyses they attempt before sharing with a device maker.
The data that material suppliers work with tends to be more varied and less standardized than fab data:
  • Raw material certificate of analysis (CoA) data
  • Raw material characterization data
  • Formulation (charge) charts
  • Equipment and sensor data
  • Packaging / bottling data
  • Performance and quality data
  • Final product CoA data

Raw Material CoA data / Characterization data

These data are summary statistics and outputs of quality control checks performed by the raw material (N-2) supplier. The checks cover chemical, compositional, quality, and observational characteristics of the material. N-2 suppliers may maintain some records of these measures, but the data is likely not digitized or easily accessible. A CoA has the general form of the table below.
Characteristic | Measure | Specification
Appearance | Powder | Powder
Color | White | White
Molecular weight | 550685 | 550000 – 510000
Infrared spectrum | Conforms to structure | Conforms
Metals | 0.8 ppb | < 1 ppb
Particles | 8 counts <0.1 µm | < 10 counts <0.1 µm
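Once CoA values are digitized, checking them against specification limits is straightforward. The sketch below is a minimal illustration that covers only the two rows of the table above with a simple upper limit (metals and particles). The record layout and field names are hypothetical.

```python
# Two CoA rows with upper-limit specifications, taken from the table above.
# The dictionary layout and field names are assumptions for this sketch.
coa_results = [
    {"characteristic": "Metals", "value": 0.8, "unit": "ppb", "upper_limit": 1.0},
    {"characteristic": "Particles", "value": 8, "unit": "counts", "upper_limit": 10},
]

def check_upper_limits(results):
    """Flag each characteristic as in spec and report the remaining margin."""
    report = {}
    for row in results:
        margin = (row["upper_limit"] - row["value"]) / row["upper_limit"]
        report[row["characteristic"]] = {
            "in_spec": row["value"] < row["upper_limit"],
            "margin_fraction": margin,  # share of the limit still unused
        }
    return report

coa_report = check_upper_limits(coa_results)
```

Tracking the margin as well as the pass/fail flag makes batches that pass but sit close to a limit visible for later correlation with fab data.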
Due to constraints and limitations at the raw material suppliers, the N-1 material supplier may conduct supplementary measurements. These could be additional checks on metals, particles, or other types of impurities, or chemical/compositional checks such as gas chromatography-mass spectrometry (GC-MS), UV-Vis spectroscopy, or nuclear magnetic resonance (NMR). Example data is shown below.