inside any residual block will be attenuated (or canceled) by the residual (skip) connection. This analysis pinpoints the set of concepts that the model relies on to detect a given class, and identifies concepts which, when transformed in a certain manner, hurt model performance on test examples from non-target classes containing that concept. In contrast, standard fine-tuning approaches are unable to do so given the same data. To correct this behavior, we rewrite the model's prediction rules in a targeted manner (cf. the affected class, race car). An alternative is to make concepts explicit, e.g., by training the model to predict a set of attributes \citep{lampert2009learning,koh2020concept}. (Table: Hyperparameters chosen for evaluating on the real-world test cases.)
different class (here, police van), and then manually replace it. We now describe the details of our evaluation methodology, namely, how we measure the effect of modifying a given layer L (similar to our editing approach). (a) Classes for which the model relies on the concept grass. We used a smartphone camera to photograph each of these objects against a plain background. The analysis of the previous section demonstrates that editing can correct misclassifications on the target class (examples of which are used to perform the rewrite). We grid over settings such as [(10^3, 10k), (10^4, 20k), (10^5, 40k), ...]. The fraction of misclassifications corrected on the transformed examples can drop to even a negative value when the method causes more errors than it fixes. This motivates a more direct way of modifying model behavior: rewriting its prediction rules in a targeted manner, rather than collecting additional data and then using it to further train the model. When we fine-tune a suffix of the model (global fine-tuning), we instead update all layers after a chosen one. The rules generalize to different variants of the style (e.g., textures of wood) and to other classes. We evaluate on the ImageNet and Places-365 \citep{zhou2017places} datasets (cf. Section 2); the typographic attack causes the model to classify household objects as iPods. You may need to adjust the environment YAML file depending on your setup. Our approach requires virtually no additional data collection and can be applied to a variety of settings, including adapting a model to new environments, and modifying it to ignore spurious features. If instead we restrict our attention to a single class, we can pinpoint the set of concepts the model relies on to detect it. The canonical approach for performing such post hoc modifications is prior work on the dependence of a classifier on high-level concepts: silencing sets of neurons \citep{bau2020understanding}.
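The sensitivity measurement sketched above (accuracy on clean images versus accuracy on concept-transformed images) can be written in a few lines. This is an illustrative helper, not code from the paper's repository; `preds_*` are predicted labels on the same test images before and after the transformation:

```python
def accuracy(preds, labels):
    """Fraction of predictions matching the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def accuracy_drop(preds_clean, preds_transformed, labels):
    """Sensitivity of a model to a concept-level transformation:
    accuracy on clean images minus accuracy on transformed ones.
    A large drop means the model relies heavily on the concept."""
    return accuracy(preds_clean, labels) - accuracy(preds_transformed, labels)

# Toy example: 4 test images, one prediction flips after the transformation.
labels = [0, 0, 1, 1]
clean = [0, 0, 1, 1]        # 100% accuracy
transformed = [0, 0, 1, 0]  # 75% accuracy
drop = accuracy_drop(clean, transformed, labels)  # 0.25
```

A negative drop would indicate the transformation actually helped the model on that class.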
This repository contains the code and data for our paper:

Editing a classifier by rewriting its prediction rules
Shibani Santurkar*, Dimitris Tsipras*, Mahi Elango, David Bau, Antonio Torralba, Aleksander Madry

For each concept-style pair, we grid over different hyperparameters. All the pre-trained models used are open-sourced and available for non-commercial research. This work was supported in part by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001120C0015 and the SAIL-ON HR0011-20-C-0022 grant, Open Philanthropy, and a Google PhD fellowship. Some transformations hurt performance: e.g., making plants floral hurts accuracy on certain classes. A concept-transformation pair may not be suitable for editing, i.e., when the transformed concept is critical for recognizing the class. We then report the performance of the method on the test set (the other 70% of samples containing this concept). Given a standard dataset, we first identify salient concepts within the dataset. Overall, this pipeline provides a scalable path for model designers to analyze and modify their models before or during deployment. We consider a subset of layers for VGG models, [4,6,7] for ResNet-18, and [8,10,14] for ResNet-50. Interestingly, global fine-tuning also helps to correct some of these errors. Now you can explore our editing methodology in various settings: vehicles-on-snow, typographic attacks and synthetic test cases. In particular, we focus on classes such as racing car. The edit modifies the weights directly before layer L, and we compare against the fine-tuning approaches discussed in Section 3 along this axis. We use a DeepLab backbone (https://github.com/kazuto1011/deeplab-pytorch) for the segmentation modules. The exemplar pairs (x, x') belong to a single (randomly-chosen) target class in the dataset.
To test the effectiveness of our approach, we start by considering two pre-trained models. We find that this change significantly reduces the efficacy of editing; we compare how editing and fine-tuning behave when they are applied to different layers of the model in Appendix Figures 15-18. (b) Classes relying on other concepts (e.g., sea, tree). The drop in model accuracy induced by the transformation on images of that class serves as our sensitivity metric. (Figure: performance of editing and fine-tuning.) Intuitively, the goal of this update is to modify the layer parameters to rewrite the desired key-value mapping in the most minimal way: we update W so that v = Wk, where k corresponds to the old concept that we want to replace and v to the new one. This is useful for adapting models to new environments (e.g., snow-covered roads), and to confusing or adversarial test conditions. In classifiers, we rewrite an existing rule in the network as desired; we refer to this as the rewriting technique. Experiments were run on NVIDIA 1080Ti GTX GPUs. Concurrently with our work, there has been a series of methods proposed for editing models directly. One potential concern is the impact of this process on the model's accuracy on the rest of the data. Editing improves performance by more than 20 percentage points, even when performed using only three exemplars (here, the exemplar was a police van). Concretely, we build on the recent work of \citet{bau2020rewriting} to develop a method for rewriting a classifier's prediction rules. We collect images from Flickr (https://www.flickr.com/) using the relevant query, and verify that exemplar masks exclude regions that do not contain the concept. This allows us to perform the concept-level transformation for several styles and classes via style transfer.
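The "most minimal" key-value rewrite can be illustrated with a least-change rank-one update: among all matrices W' satisfying W'k = v, pick the one closest to W in Frobenius norm. This is a simplified sketch of the idea only; the actual method of \citet{bau2020rewriting} formulates a covariance-weighted constrained optimization, which this does not implement:

```python
import numpy as np

def rank_one_edit(W, k, v):
    """Minimal-norm rank-one update of W such that (W + update) @ k == v.

    Simplified sketch of the key-value rewrite idea: W is a layer weight
    matrix (out_dim x in_dim), k the key vector for the old concept, and
    v the value vector we want the layer to produce for that key.
    """
    residual = v - W @ k                       # how far W currently is from mapping k -> v
    update = np.outer(residual, k) / (k @ k)   # rank-one correction along k
    return W + update

# Usage: after the edit, the key maps exactly to the desired value,
# while directions orthogonal to k are untouched.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
k = rng.normal(size=4)
v = rng.normal(size=3)
W_new = rank_one_edit(W, k, v)
```

Because the correction lies entirely along k, inputs orthogonal to the old concept direction are processed exactly as before, which is the intuition behind "rewriting in the most minimal way."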
Section 2), using this single synthetic snow-to-road exemplar. We present a methodology for modifying the behavior of a classifier by directly rewriting its prediction rules. A long line of work has been devoted to analyzing such models. We build on the recent work of Bau et al. Using LVIS \citep{gupta2019lvis} we are able to automatically generate concept segmentations. In Appendix Figures 19 and 20 we present further results. The rewrite maps representations of one concept (e.g., snowy road) to representations of another (e.g., road). Fine-tuning can overfit to the specific target class used for fine-tuning and/or have lower accuracy on normal examples. Since the images are sourced from Flickr search engines, the images themselves belong to the individuals who uploaded them. We evaluate on examples containing this concept (and transformed using the same style as the exemplar). We report results for a VGG16 classifier, as a function of the layer that is modified. We transform concepts within dataset images using style transfer \citep{gatys2016image} (e.g., to create snowy roads). This is typically achieved by either fine-tuning the model on the new domain \citep{donahue2014decaf,razavian2014cnn,kumar2020understanding}, learning the correspondence between the source and target domain, often in a latent representation space \citep{ben2007analysis,saenko2010adapting,ganin2014unsupervised,courty2016optimal,gong2016domain}, or updating the model's batch normalization statistics \citep{li2016revisiting,burns2021limitations}. In particular, note that the effect of a rewrite to a layer propagates to all subsequent layers. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Concurrent work edits factual knowledge in language models \citep{mitchell2021fast}. We use a weight decay of 5e-4 and a batch size of 256 for both models.
(Places-365 data: http://places2.csail.mit.edu/download.html.) Performing such edits does require manual intervention and domain expertise. Crucially, this allows us to change the behavior of the classifier on all examples containing the concept. When fine-tuning a single layer (local fine-tuning), we optimize the weights of that layer only; when fine-tuning globally, we update the weights directly after layer L. We measure how sensitive the model's prediction is to a given high-level concept in terms of the induced accuracy drop. We consider a subset of 8 styles for our analysis. Crucially, this update should change the model's behavior for every occurrence of the concept. It is clearer than ever that our models are a reflection of the goals and biases of those of us who build them. We choose the best set of hyperparameters on a held-out validation set. The edited model recognizes vehicles even when they have wooden wheels.

@InProceedings{santurkar2021editing,
  title     = {Editing a classifier by rewriting its prediction rules},
  author    = {Shibani Santurkar and Dimitris Tsipras and Mahalaxmi Elango and David Bau and Antonio Torralba and Aleksander Madry},
  year      = {2021},
  booktitle = {Neural Information Processing Systems (NeurIPS)}
}

You can start by cloning our repository and following the steps below. Our work may have broader implications, and this improvement extends to transformations using other styles. We do not foresee immediate risks that would result from our work. Global fine-tuning corrects most errors on the target class. Fine-tuning fails in this setting, typically causing more errors than it fixes.
At the same time, we need to also ensure that the rewriting task we are performing is appropriate: if the transformed concept is absolutely essential for recognition, this might not be an appropriate edit to perform. We use a momentum of 0.9 and a weight decay of 5e-4. For comparison, we also consider two variants of fine-tuning using the same exemplars, applied to an ImageNet-trained VGG16 classifier (Figure 7); another example is an ImageNet image of a can opener. Prior work probes models via behavioral tests: e.g., via synthetic data \citep{goyal2019explaining}, or by swapping concepts between images. We visualize the average number of misclassifications corrected. Some ImageNet classes correspond to specific breeds (e.g., parrot); the concept person is handled separately. We apply the method of Section 2.1 to modify a chosen layer, replacing the road concept with snow texture obtained from Flickr. The segmentation model can detect 1230 classes. Below, we outline the data-collection process. The effect of such features on model predictions can be analyzed by transforming occurrences of a concept (e.g., images of cows on the beach). Here, k is the direction corresponding to the concept of interest, and d is the top eigenvector of the second-moment matrix of the concept representations. The edit causes the model to do the same when encountering other vehicles with wooden wheels, using only a few exemplars to perform the modification. Published: 21 May 2021 (modified: 27 Dec 2021). Keywords: model debugging, spurious correlations, robustness. In order to get a better understanding of the core factors that affect editing, we study snow conditions, a setting that could be pertinent to self-driving cars. For instance, to replace domes with trees, we rewrite the mapping between layer L and the output of the model.
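One common way to derive such a concept direction, consistent with the "top eigenvector" description above, is to take the dominant eigenvector of the uncentered covariance of layer activations gathered at spatial locations covered by the concept mask. A minimal NumPy sketch, with illustrative names (the real pipeline extracts `features` from a specific network layer):

```python
import numpy as np

def concept_key(features):
    """Return the top eigenvector of the uncentered covariance of
    concept activations (one way to derive a key direction).

    features: (N, d) array of layer activations at spatial locations
    where the concept mask is active.
    """
    cov = features.T @ features / len(features)  # (d, d) second-moment matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    return eigvecs[:, -1]                        # direction of largest variance

# Example: activations concentrated along a known direction are recovered.
rng = np.random.default_rng(0)
true_dir = np.array([3.0, 4.0]) / 5.0
feats = rng.normal(size=(500, 1)) * true_dir + 0.01 * rng.normal(size=(500, 2))
k = concept_key(feats)  # approximately +/- true_dir
```

Note that eigenvectors are only defined up to sign, so downstream code should be insensitive to the direction's sign.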
We picked six household objects corresponding to ImageNet classes. However, at the same time, the fine-tuned model's performance on other classes degrades. With these considerations in mind, \citet{bau2020rewriting} view a layer as an associative memory, and formulate the rewrite operation as a constrained optimization problem. The goal of this work is to instead develop a more direct way to modify model behavior; the canonical alternative is to intervene at the data level. (Figure: misclassifications corrected over different concept-transformation pairs, for ResNet-50 models.) Relevant resources:

https://github.com/MadryLab/EditingClassifiers
http://places2.csail.mit.edu/download.html
https://pytorch.org/vision/stable/models.html
https://github.com/kazuto1011/deeplab-pytorch
https://github.com/facebookresearch/detectron2/blob/master/MODEL_ZOO.md
https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2
https://www.image-net.org/update-mar-11-2021.php

The edit generalizes to instances of the concept beyond those present in exemplars used to perform the modification (cf. Section 2). The model's accuracy on the target class drops when the concept is transformed, whereas its accuracy on collie does not change. We enumerate potential transformations for a single concept. Both models are trained from scratch on the ImageNet and Places365 datasets. To do this in a scalable manner, we leverage pre-trained instance segmentation and style-transfer models. The goal of our work is to develop a toolkit that enables users to edit models directly. At a high level, we would like to apply the approach described in Section 2, without tuning hyperparameters directly on these test sets. However, if our dataset contains classes for which the presence of snow is essential, such an edit may be inappropriate. We take a closer look at the performance improvements caused by editing, transforming concepts in a consistent manner using existing methods for style transfer. After all, can we expect the world to pose for us in a variety of environments?
pertaining to the rewrite (via editing or fine-tuning). Typical approaches include data augmentation schemes \citep{lopes2019improving,hendrycks2019augmix,zhang2021does}. This work was supported in part by DARPA Contract No. HR001120C0015. To achieve this, we must map the keys for wooden wheels to the value for standard wheels. Moreover, we cannot expect the performance of the model to improve on concepts it has never seen. (Figure: performance vs. drop in accuracy.) For most of our analysis, we utilize two canonical, yet relatively simple, architectures. We report results (with confidence intervals obtained via bootstrapping) for a specific concept, over various styles. Here, we describe the exact architecture and training process for each model we use. We rewrite the model's prediction rules to map snowy roads to road. The typographic attack reduces model accuracy on clean images from the iPod class. We searched for vehicles on snow and manually selected the images that clearly depict the scenario. We can also examine which transformations hurt performance most. We replace the handwritten/typed text with a white mask (cf. Section 4). Since our manually collected test sets are small, we also consider variants of fine-tuning. Examples of spurious concepts include dress for the class groom, person for the class tench, road for the class race car (cf. Figure 9), and dog in images of class poodle. In particular, some rules could be based on biases in the data: objects \citep{stock2018convnets,tsipras2020from,beyer2020are} or subpopulations that are under-represented in the training data, despite being affected by the model edit. Instead, we inspected the results of the large-scale synthetic evaluation. We use a pre-trained style-transfer model (https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2).
require holding out a non-trivial number of data points. We also measure accuracy on examples from other classes (Appendix), averaged across concepts. We only use images in which the corresponding object is present, which allows us to ensure that evaluation errors do not stem from the transformation itself. We keep detections whose predicted probability is at least 0.80 for the COCO-based model and 0.15 for the LVIS-based one (cf. Figure 1). We use the pipeline of Section 4 to discover a given classifier's prediction rules. We hypothesize that this has a regularizing effect as it constrains the update. For each of these classes, we searched for relevant images; our pipeline thus characterizes the effect of concept-level transformations on classifiers. While global fine-tuning is more expressive, it modifies many more parameters. We present the twenty classes for which the visual concept is most often detected. We collected images for the relevant ImageNet classes using Flickr (details in Appendix A.5). (a) Model predictions after the rewrite (local). Our pipeline, which takes as input an existing dataset, consists of the following two steps: concept identification and concept transformation. In general, for editing, using more exemplars tends to improve the number of misclassifications corrected. The segmentation model is trained on MS-COCO; the transformations are described in Section 4.
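The detection-filtering step can be sketched by combining the confidence threshold above with a minimum-size criterion (the text elsewhere mentions a concept must cover at least 100 pixels). The detection format below is illustrative only, not the actual detectron2 output format:

```python
import numpy as np

def filter_detections(detections, min_pixels=100, min_score=0.80):
    """Keep only detections that are confident and large enough.

    detections: list of dicts with a boolean 'mask' array and a 'score'
    (hypothetical format; real segmentation libraries differ).
    """
    return [d for d in detections
            if d["score"] >= min_score and int(d["mask"].sum()) >= min_pixels]

# Toy example: one confident large mask, one low-confidence, one tiny mask.
big = np.ones((20, 20), dtype=bool)                         # 400 pixels
tiny = np.zeros((20, 20), dtype=bool); tiny[0, :5] = True   # 5 pixels
dets = [{"mask": big, "score": 0.95},
        {"mask": big, "score": 0.30},
        {"mask": tiny, "score": 0.99}]
kept = filter_detections(dets)  # only the first detection survives
```

For the LVIS-based model, the score threshold would be lowered (e.g., `min_score=0.15`), matching the looser cutoff mentioned above.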
Intuitively, we want the classifier to perceive the wooden wheel in the transformed image as a standard one. The number in parentheses denotes the standard deviation. Second, we consider a recent typographic attack. To describe our approach, we use the task of recognizing vehicles on snow; styles are generated via style transfer \citep{gatys2016image,ghiasi2017exploring}. The attack succeeds with iPod handwritten on the object, as well as when affixing a blank piece of paper. We treat each convolution-BatchNorm-ReLU block as a single layer, similar to \citet{bau2020rewriting}. For some architectures the rewriting process causes more mistakes than it fixes. The pre-trained models are available for non-commercial research. Some concepts are spuriously tied to a class in the image, e.g., dress for groom. \citet{bau2020rewriting} view the layer as an associative memory, which maps such a concept vector (key) to an output (value). In particular, we break down the improvements on test examples from each class. If the method is effective, then it should recover some of the incorrect predictions of an ImageNet-trained model; in contrast, fine-tuning the model under the same setup does not. We thus manually exclude such classes, apart from one selected for a demonstration of the full accuracy-effectiveness trade-off. The edit changes overall accuracy of the model by over 0.25%. The update rule of \citet{bau2020rewriting} leads to even better performance (cf. Appendix). We tune parameters such as the learning rate.
Specifically, we require that the concept covers at least 100 pixels (image size is 224x224 for ImageNet and 256x256 for Places). To this end, we create a validation set per concept-style pair with 30% of the samples containing this concept, which is ideal for our use-case. (c) We edit a CLIP [RKH+21] model such that the text "iPod" maps to a blank area. This indicates that the classifier is not disproportionately affected by the transformation. For example, to understand whether the model relies on the presence of a given concept, each instance can be synthetically transformed. At a high level, our method enables the user to modify the weight of a layer directly (downsampling the mask to the appropriate dimensions). At a high level, our goal is to automatically create test set(s) in which a concept is transformed; here k corresponds to the old concept that we want to replace, and v to the new concept. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA). Such analyses are useful to gain insights into how a given model makes its predictions (cf. Figure 6b), e.g., for the concept ski. We now shift our attention to the focus of this work: editing classifiers (Figure 1). Models are trained for 131072 iterations using SGD with a single-cycle learning rate schedule. Figure 1: Editing prediction rules in pre-trained classifiers using a single exemplar.
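Selecting a configuration on such a per-concept validation split can be sketched generically. All names here are hypothetical: `evaluate` stands for any routine that applies an edit under one configuration and returns accuracy on the held-out 30% validation samples:

```python
import itertools

def select_hyperparameters(evaluate, layers, ranks):
    """Grid over (layer, rank) configurations and return the one with
    the highest validation score. `evaluate(layer, rank)` is a callable
    mapping a configuration to held-out accuracy."""
    grid = itertools.product(layers, ranks)
    return max(grid, key=lambda cfg: evaluate(*cfg))

# Toy example with precomputed validation scores for each configuration.
scores = {(8, 1): 0.70, (8, 3): 0.60,
          (10, 1): 0.85, (10, 3): 0.80,
          (12, 1): 0.75, (12, 3): 0.50}
best = select_hyperparameters(lambda l, r: scores[(l, r)], [8, 10, 12], [1, 3])
# best -> (10, 1)
```

Crucially, the test set (the other 70% of samples) is never consulted during this selection, avoiding tuning hyperparameters directly on the test data.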
Concretely, our pipeline (see Figure 4 for an overview) takes as input an existing dataset. The layer acts at each spatial location in its input (which we refer to as keys). The six household objects correspond to ImageNet classes, namely: teapot, mug, flower pot, and others. (Figure: heatmaps illustrating classifier sensitivity to various concept-level transformations.) We then transform the detected concept (within dataset images) in a consistent manner, e.g., mapping snowy roads to road, to adapt a system to different weather conditions. We describe how we chose the hyperparameters for each method. \citet{bau2020rewriting} propose making the following rank-one updates to the parameters W of an arbitrary non-linear layer. The images are obtained from the Flickr image-hosting service. The original image x for our approach is obtained by replacing the concept region. The typographic attack can also cause the model to spuriously associate other images with the target class. We study classifiers trained on ImageNet and Places-365 (similar to our editing approach); the responsibility for the robustness of the model lies with the model designer. Datasets can contain biases \citep{khosla2012undoing,tommasi2014testbed,recht2018imagenet}, or variations in the data subpopulations present \citep{beery2018recognition,oren2019distributionally,sagawa2019distributionally,santurkar2020breeds,koh2020wilds}. We break down corrections (and failures to do so) due to editing and fine-tuning. The update should apply to every instance of the concept encoded in k, i.e., all domes in the dataset. We find that editing is able to consistently correct mistakes, whereas fine-tuning often hurts accuracy on one or more classes. See Appendix A for experimental details. We report the examples misclassified by the model before and after the rewrite, respectively.
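Given a segmentation mask, one simple way to realize such a concept-level transformation is to paste a new texture (e.g., snow) into the masked region. This is a minimal sketch of the masking idea only; the paper also uses style transfer, which this does not implement:

```python
import numpy as np

def transform_concept(image, mask, texture):
    """Replace the concept region of `image` (where `mask` is True)
    with pixels from `texture` -- e.g., overlay snow texture on a road.

    image, texture: (H, W, 3) float arrays; mask: (H, W) boolean array.
    Returns a new array; the input image is left unmodified.
    """
    out = image.copy()
    out[mask] = texture[mask]
    return out

# Toy example: a 4x4 "image" whose bottom half is the concept region.
image = np.zeros((4, 4, 3))     # all-black image
texture = np.ones((4, 4, 3))    # all-white "snow" texture
mask = np.zeros((4, 4), dtype=bool)
mask[2:, :] = True              # concept occupies the bottom two rows
x_prime = transform_concept(image, mask, texture)
```

The resulting (x, x') pair, clean image plus its transformed counterpart, is exactly the kind of exemplar the rewrite consumes.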
ensuring comparable performance across subpopulations \citep{sagawa2019distributionally}, or enforcing consistency across inputs that depict the same entity \citep{heinze2017conditional}. We only focus on the subset of examples D that were correctly classified by the original classifier. While such prediction rules may be useful in some scenarios, they will be harmful in others. The views and conclusions are those of the authors. We demonstrate our approach in two scenarios motivated by real-world applications. The styles include black and white, floral, and fall colors. Editing requires no additional training or data collection. The pipeline consists of concept detection and concept transformation (Section 4). We test on the other 70% of samples containing this concept (cf. Section 5). The transformations correspond to human-understandable changes in synthesized images. In presenting them to human annotators, we did not perceive any additional transformations of interest. We study: (i) a VGG16 variant with batch normalization, and (ii) a ResNet-50. The edit changes the representation of the concept in the transformed image (x'). We obtain segmentations for a range of high-level concepts (e.g., grass, sea, tree). The typographic attack fools a classifier by attaching a piece of paper with iPod written on it to various household objects. Finally, we can also examine the effect of specific transformations on a single class. The pipeline we developed can also be viewed as a scalable approach for generating such benchmarks; the edit teaches the model to recognize vehicles on snowy roads with a single exemplar. Rank restriction (Figure 22).
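Restricting evaluation to examples the model originally got right can be sketched as follows (a hypothetical helper; `preds_*` are predicted labels on the clean and transformed versions of the same images):

```python
def rewrite_targets(preds_clean, preds_transformed, labels):
    """Indices of examples correctly classified on clean images but
    misclassified after the concept transformation -- exactly the
    mistakes a rewrite is meant to fix."""
    return [i for i, (pc, pt, y) in
            enumerate(zip(preds_clean, preds_transformed, labels))
            if pc == y and pt != y]

# Toy example: only example 1 is correct before and wrong after.
idx = rewrite_targets([0, 1, 2], [0, 0, 1], [0, 1, 1])  # -> [1]
```

Errors already present on the clean images are excluded so that measured failures genuinely stem from the transformation and not from the base model.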
concept-transformation pairs. For instance, if we replace all instances of dog with a stylized version, we can measure the resulting effect on predictions. The update applies to an arbitrary non-linear layer f; here, S denotes the set of spatial locations in representation space for a given input. Our rule-discovery and editing pipelines can be viewed as complementary to existing interpretability methods. In Section 3 we study two real-world applications of our approach, identifying which transformations hurt model performance on the given class most substantially. Rather than curating training data, our method allows users to directly edit the model's