ICLR 2019
Debugging Machine Learning Models
- Opening remarks
- Lots of exciting papers on debugging spread across conferences; this workshop brings them together in a single place.
- Sponsored by Google and OpenAI.
- A new perspective on adversarial perturbations
- ML tends to be extremely brittle; eg. adding noise makes it break.
- Robustness is "just" about getting better models.
- Look at it like a computer.
- Make adversarial examples manually
- Perturbations are actually features for the models
- Relabel each perturbed image as its target class
- Train a model on this mislabeled data, and test against the original test set – it still generalizes
- Robust features – correlated with label, even with adversary
- Non-robust features – can be flipped by using an adversary
- Still useful for generalization
- A consequence of this is transferability
- features are a property of the dataset, not the model
- any other models will pick up these features
- Human vs ML model priors: no reason for model to move to human priors
- No hope for interpretable models without intervention at training time
- Robust optimization: minimize the max over all perturbations of x of the loss (see the sketch after this block)
- Model can't depend on anything that changes too much within delta
- Explicitly break the reliance on non-robust features
- Can restrict features to robust features
- Then training normally on the new set creates a robust model
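A minimal sketch of the robust objective min_theta E[max over ||delta|| <= eps of L(x + delta, y; theta)], approximating the inner maximization with projected gradient descent (PGD); `model`, `loss_fn`, and `opt` are hypothetical placeholders, and the eps/step values are illustrative:

```python
import torch

def pgd_perturb(model, loss_fn, x, y, eps=0.03, step=0.007, iters=10):
    """Approximate the inner max of min_theta E[max_{||d||_inf <= eps} L(x + d, y)]."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        loss_fn(model(x + delta), y).backward()
        # Gradient *ascent* on the loss, then project back into the eps-ball.
        delta.data = (delta.data + step * delta.grad.sign()).clamp(-eps, eps)
        delta.grad.zero_()
    return delta.detach()

def robust_training_step(model, loss_fn, opt, x, y):
    delta = pgd_perturb(model, loss_fn, x, y)   # inner maximization
    opt.zero_grad()                             # drop grads accumulated above
    loss_fn(model(x + delta), y).backward()     # outer minimization step
    opt.step()
```

Because the loss is maximized within the delta-ball before each update, the model can't lean on any feature an adversary can flip inside that ball.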
- Properties of robust models
- More semantically aligned
- Synthesis
- Integrity should probably be using these approaches a lot?
- This feels like a form of regularization. Would we get similar results by dramatically increasing the training set with similar perturbations? Or is this effectively doing localized gradient descent to find the maximum perturbation, as a speed-up?
- Follow ups
- Will I get similar behavior by simply generating n times the training examples?
- How do you generate the adversarial perturbations?
- How does this affect differentiability of the loss?
- Why is it better to generate a new dataset than to modify the training pipeline?
- Interpretability at training time: enforce it at training time, instead of after handing over the model.
- Actually read the paper
- Similarity of Neural Network Representations Revisited
- Tools to understand trained neural networks
- Comparing them allows understanding them, particularly after tweaking the network
- Comparing representations:
- Compare every pair of features, eg. with a dot product: X^T Y
- Compare examples between two layers: dot product XX^T
- Representational similarity matrix
- Compare these matrices – dot product of the vectorized (reshaped) matrices
- Comparing features = comparing examples
- Created a similarity index to normalize the value
- Normalized so scaling doesn't change the value
- "Centered kernel alignment"
- "RV Coefficient"
- "Tucker's confluence coefficient"
- Replace the dot product with the kernel
- Evaluate the similarity index: compare two trainings of the same network (eg. different random seeds)
- Helps reveal network pathologies
- CKA shows which layers correspond between the two trainings (see the sketch after this list)
- Logistic regression on each layer? (to compare layers)
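A minimal sketch of linear CKA from the definitions above (my reading, not the authors' code); X and Y are examples × features activation matrices from the two layers being compared:

```python
import numpy as np

def linear_cka(X, Y):
    # Center each feature so the index ignores constant offsets.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # <vec(XX^T), vec(YY^T)> = ||Y^T X||_F^2, normalized so the index is
    # invariant to isotropic scaling and orthogonal transformations.
    numerator = np.linalg.norm(Y.T @ X, "fro") ** 2
    denominator = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return numerator / denominator

# Sanity check: a rotated copy of a representation scores ~1.0.
X = np.random.randn(100, 64)
Q = np.linalg.qr(np.random.randn(64, 64))[0]  # random orthogonal matrix
print(linear_cka(X, X @ Q))  # ~1.0
```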
- Synthesis
- Can use this to compare predictor nets / blob distributions a little bit more cleanly, and presumably much faster
- Not sure how useful this is for explainability; it's complex enough that it doesn't give me much intuition about what's happening.
- This also won the best research paper award, so clearly I'm missing something important.
- Follow ups
- Read the paper and try to work through/implement it; I clearly didn't understand the underlying mathematics
- Error terrain analysis for ML
- Model evaluation and testing end up hiding errors introduced during featurization and training.
- Models are evaluated on a single number, which can hide specific failure conditions
- Goal: identify errors systematically and rigorously
- Different regions of the dataset can fail very differently with very different reasons
- Creating good software engineering practices for building ML systems
- Identify errors more systematically
- Have been doing this for ~3 years: failure explanations, unknown unknowns
- Software Engineering for Machine learning
- Benchmark: run model against benchmark, provide instances and error labels (performance indicators on each instance) – tells how the model is doing
- Plus additional valuable features
- Allows inspecting the underlying raw data to see how it was being converted
- Synthesis
- Given a benchmark dataset with features you care about for detecting bias, run your model against it and break down performance by those features to understand and explore biases in the dataset (sketch below).
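A toy sketch of that per-slice breakdown; the dataset and column names are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "age_group": ["<30", "<30", "30-60", "30-60", ">60", ">60"],
    "correct":   [1, 0, 1, 1, 0, 1],
})
print("overall accuracy:", df["correct"].mean())  # the single number
print(df.groupby("age_group")["correct"].mean())  # per-slice accuracy
```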
- Follow up
- Read Microsoft's "Software Engineering for Machine Learning" paper
- SKIP Verifiable reinforcement learning via policy extraction
- Debugging Machine Learning via Model Assertions
- How do you do quality assurance over models?
- Model assertions
- both at test time and train time
- study both soft and exact assertions
- eg. bounding boxes shouldn't flicker between frames (sketch at the end of this talk's notes)
- eg. bounding boxes shouldn't overlap
- At training time, use assertions for
- active learning
- weak supervision
- At inference time
- runtime monitoring
- corrective actions
- Apply a bandit algorithm to select the model assertions with the highest marginal gain.
- mean average precision: ranking metric
- rank orders boxes and computes precision at several recall levels
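A minimal sketch of average precision for a single class (my paraphrase of the standard metric, not the speaker's code); `scored` pairs a confidence with whether the detection was a true positive:

```python
def average_precision(scored, n_positives):
    scored = sorted(scored, reverse=True)      # rank boxes by confidence
    hits, precisions = 0, []
    for rank, (_, is_tp) in enumerate(scored, start=1):
        if is_tp:
            hits += 1
            precisions.append(hits / rank)     # precision at this recall level
    return sum(precisions) / n_positives if n_positives else 0.0

print(average_precision([(0.9, True), (0.8, False), (0.7, True)], 2))  # ~0.833
```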
- can they catch "Confident mistakes"?
- they catch high confidence errors missed by uncertainty sampling
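A hypothetical model assertion in the spirit of the flicker example above; `detections` maps a frame index to the set of object ids detected in that frame (all names are assumptions, not the paper's API):

```python
def assert_no_flicker(detections):
    """Flag ids present at frames t-1 and t+1 but missing at t (a 'flicker')."""
    flagged = set()
    frames = sorted(detections)
    for prev, cur, nxt in zip(frames, frames[1:], frames[2:]):
        for obj in detections[prev] & detections[nxt]:
            if obj not in detections[cur]:
                flagged.add((cur, obj))  # candidate for active learning / correction
    return flagged

print(assert_no_flicker({0: {"car1"}, 1: set(), 2: {"car1"}}))  # {(1, 'car1')}
```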
- SKIP Improving jobseeker match models
- Discovering natural bugs with adversarial perturbations
- Black box explanations for debugging
- Perturb the input in a specific way; if the original and new predictions don't match, it's a bug.
- What is the smallest delta that flips the model's decision? This is an adversarial search.
- Can make natural perturbations using GANs
- Semantically equivalent adversaries
- Semantically equivalent adversarial rules
- Automated adversaries do as well as humans at finding these edge cases in the models – but they don't find the same ones!
- Automated flipping did better than expert-written rules.
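A toy sketch of that smallest-flip search, as a greedy single-token paraphrase swap; `predict` and `paraphrases` are hypothetical stand-ins, not the speaker's API:

```python
def smallest_flip(tokens, predict, paraphrases):
    original = predict(tokens)
    for i, tok in enumerate(tokens):
        for alt in paraphrases.get(tok, []):
            candidate = tokens[:i] + [alt] + tokens[i + 1:]
            if predict(candidate) != original:
                return candidate  # a one-token semantically equivalent adversary
    return None                   # no single-token swap flips the decision
```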
- SKIP Debugging discriminatory ml systems
- Algorithms for verifying deep neural nets
- NeuralVerification.jl – survey of methods for soundly verifying properties of neural networks.
- Define an input/output problem – for every input in the input set, do the outputs lie in the output set? (formalized below)
- Reachability
- Optimization
- Search
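A minimal statement of that verification problem, for a network f, input set X, and output set Y:

```latex
% The property holds iff no input in X is mapped outside Y;
% reachability methods (over-)approximate f(X) and check containment.
\forall x \in \mathcal{X} :\; f(x) \in \mathcal{Y}
\quad\Longleftrightarrow\quad
\nexists\, x \in \mathcal{X} :\; f(x) \notin \mathcal{Y}
```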
- TODO Safe and reliable machine learning
- Better code for less debugging with AutoGraph
- Types of defects: programming, modeling, data
- Prevention: unit tests
- Focusing on programming bugs
- occur in foundations of the ML system
- can be confounded with other errors
- can be very difficult to detect
- Prevention
- Should be cheap with good tools
- Focus on readability to spot errors, understand intent and catch issues.
- Eg.
- TensorFlow's eager execution
- portable code for multiple platforms
- AutoGraph converts Python functions to TensorFlow graph code and optimizes them
- AutoGraph integrates with TensorBoard and relies on eager execution.
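A minimal sketch of AutoGraph in action: plain Python control flow inside a `tf.function` gets staged into graph ops (the Collatz example is mine, not the speaker's):

```python
import tensorflow as tf

@tf.function  # AutoGraph rewrites the Python while/if below into graph ops
def collatz_steps(n):
    steps = tf.constant(0)
    while n > 1:              # staged as tf.while_loop
        if n % 2 == 0:        # staged as tf.cond
            n = n // 2
        else:
            n = 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(tf.constant(27)))                       # tf.Tensor(111, ...)
print(tf.autograph.to_code(collatz_steps.python_function))  # inspect generated code
```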
- The scientific method
- hypothesis, expectations, design, stats analysis, uncertainty estimation
- neural networks can be studied as physical objects
- cos.io/prereg
- Don't debug your black box, replace it
- Interpretability can help accuracy
- Black boxes can easily be miscalculated, and you can't debug them
- Interpretability is a set of constraints
- loosely or strictly interpretable
- interpretable != explainable
- why explain a black box if you can produce an interpretable model?
- explaining a black box requires 2 models: one black box, one to explain it
- they must disagree somewhere – otherwise there'd be no point to the black box
- explanations might get variable importance completely wrong
- the ProPublica study made this mistake
- you can come up with an explanation that makes no sense whatsoever
- interpretable neural network
- add an extra layer at the end of it (sketch below)
- with no sacrifice in accuracy
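A rough PyTorch sketch of one way to read "an extra interpretable layer at the end": a prototype layer that classifies by similarity to learned prototypes (my interpretation, not the speaker's code; shapes and names are assumptions):

```python
import torch
import torch.nn as nn

class PrototypeHead(nn.Module):
    def __init__(self, latent_dim, n_prototypes, n_classes):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(n_prototypes, latent_dim))
        self.classify = nn.Linear(n_prototypes, n_classes)

    def forward(self, z):                    # z: (batch, latent_dim)
        d = torch.cdist(z, self.prototypes)  # distance to each learned prototype
        return self.classify(-d)             # close prototypes -> high logits

head = PrototypeHead(latent_dim=32, n_prototypes=10, n_classes=2)
print(head(torch.randn(4, 32)).shape)        # torch.Size([4, 2])
```

Each prediction can then be traced back to the nearest prototypes, which is what makes the decision inspectable.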
- Synthesis
- Is my credit score calculated like this?
- I should apply this to all the neural networks I build for myself to understand them; building this on top of MNIST would be amazing.
- Followups
- read everything by Cynthia Rudin at https://users.cs.duke.edu/%7Ecynthia/home.html
- The future of ML Debugging
- debugging is a continuous process, and the system needs to be monitored