This is the second part of a three (?) part set of articles covering how to train a sensory panel to perform descriptive profiling. It covers intensity training and descriptive profiling lexicon (ballot) development.
Part iv: Intensity
As confidence and flavor familiarity build among your panelists, it will be time to add a new twist to the training regimen: intensity training. If we are to build flavor profiles for our brands, we need to be able to quantify the magnitude of the flavors found in the samples. Just as we sought consensus in how our panelists describe the flavors, we must now seek consensus in how they rate their intensities. And just as before, we must use standardized materials and lots of discussion.
The first thing you’ll have to decide when you begin intensity training is which scale to use. Take care not to settle on one that is too narrow or too wide. If the scale is too narrow, it will be hard to find differences between samples: an intensity rating of 4 on a six-point scale (0-5) is only one step from a 3, yet each point accounts for about 17% of the available scale. Conversely, if the scale is too wide, you may have difficulty getting all your panelists to use it the same way; some may prefer the upper portion of the scale, while others are more comfortable with values at the low end, leaving room at the top for potentially more extreme samples.
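To make the scale-width tradeoff concrete, here is a small sketch that computes how much of a scale a single point or step represents for the 0-5, 0-10, and 0-15 scales. The function name `step_fraction` is just an illustrative choice, and it reports two figures because "one point out of six" and "one step out of a range of five" are slightly different ways of counting.

```python
def step_fraction(max_value: int) -> tuple[float, float]:
    """For a 0..max_value scale, return (share per point, share of range per step)."""
    points = max_value + 1          # a 0-5 scale has six points
    return 1 / points, 1 / max_value


for max_value in (5, 10, 15):
    per_point, per_step = step_fraction(max_value)
    print(
        f"0-{max_value}: {max_value + 1} points, "
        f"{per_point:.1%} of the scale per point, "
        f"{per_step:.1%} of the range per step"
    )
```

The narrower the scale, the larger the jump a panelist must make to register any difference at all, which is the reason a very short scale struggles to separate similar samples.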
Many sensory scientists like to use an 11-point scale (0-10) because it’s pretty intuitive for us base-10-thinking humans, and it’s pretty close to the “sweet spot” in terms of the size of the scale, as discussed above. For my panels, I prefer to use a 16-point scale (0-15) because it can be used with the Spectrum Method of intensity training, which is a standardized system that uses widely available materials as anchor points on the intensity scale. For example, the aroma intensity of dehydrated milk powder is considered a 4 on this scale, the aroma intensity of Triscuit crackers is an 8, and the intensity of the cinnamon aroma in Big Red gum is a 12. For each profiling panel, I have examples of these products in small jars on the table for my panelists to reference whenever they need to recalibrate their “intensity meters”. Other examples of products that can be used to anchor the Spectrum intensity scale are found in Sensory Evaluation Techniques, by Meilgaard, Civille & Carr (CRC Press).
Once you’ve chosen a standard intensity scale, it’s time to see what your panel can do with it. I start by spiking a few flavors that the panelists are familiar with into beer. For each flavor, I create a pair of spiked samples at two different concentrations, choosing levels that are known to be distinct enough that they should generate intensity ratings with a significant disparity between them. During panel, I present these “intensity pairs” to the panelists and ask them to rate, on the agreed standard intensity scale, how intense they perceive the added flavor to be. As the panel gives feedback, I record the ratings on a whiteboard so we can all see the variation. Any panelists who look like outliers from the group should be directed back to the intensity standards (like the Big Red gum) and asked to confirm their rating. If they are still rating the flavors much differently from the rest of the group, you have some choices: you can try to get them to bring their scores in line with the panel, or you can accept their uniqueness and let it be included in your data. After all, everyone is genetically unique, and that uniqueness often translates directly to uniqueness in their ability to detect certain flavors.
The latter approach has a pitfall that is particularly troublesome if you have small and fluctuating panel attendance. If, for example, one of your few panelists is very sensitive to the butteriness of diacetyl (tends to rate it high) and you include their normal ratings in the brand profiles, and then later you assess some samples without that panelist present, their absence will show up in the data as artificially lower diacetyl ratings, and your samples will seem “off profile” when in reality you are just missing an influential panelist.
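If you transcribe the whiteboard ratings afterwards, a few lines of code can point out who to send back to the intensity standards. The sketch below is one possible approach, not a standard method: it flags anyone whose rating sits more than a chosen number of points from the panel median. The panelist names, the ratings, and the 3-point `tolerance` are all made-up assumptions for illustration.

```python
from statistics import median


def flag_outliers(ratings: dict[str, float], tolerance: float = 3.0) -> list[str]:
    """Return panelists whose rating is more than `tolerance` points from the panel median."""
    panel_median = median(ratings.values())
    return sorted(
        name
        for name, score in ratings.items()
        if abs(score - panel_median) > tolerance
    )


# Hypothetical ratings for the high-concentration sample of an intensity pair (0-15 scale)
high_spike = {"Ana": 9, "Ben": 10, "Cat": 8, "Dev": 14, "Eli": 9}

print(flag_outliers(high_spike))  # ['Dev']
```

The median is used rather than the mean so that one extreme rating does not drag the reference point toward itself. A flag here is only a prompt for a conversation with that panelist, not a verdict on their palate.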
Careful attention to the data your panelists generate is essential, and you will likely face several judgement calls along the way.
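The missing-panelist pitfall is easy to see with a little arithmetic. This sketch uses invented diacetyl ratings for a four-person panel in which one hypothetical panelist, "Dev", is especially sensitive; the helper `panel_mean` is an illustrative name, not part of any sensory software.

```python
from statistics import mean


def panel_mean(ratings: dict[str, float], absent: frozenset[str] = frozenset()) -> float:
    """Average the ratings of every panelist who is not listed in `absent`."""
    present = [score for name, score in ratings.items() if name not in absent]
    return mean(present)


# Hypothetical diacetyl ratings for the same beer (0-15 scale); Dev is the sensitive taster
diacetyl = {"Ana": 3.0, "Ben": 4.0, "Cat": 3.0, "Dev": 10.0}

with_dev = panel_mean(diacetyl)                               # 20 / 4
without_dev = panel_mean(diacetyl, absent=frozenset({"Dev"}))  # 10 / 3

print(f"Full panel: {with_dev:.2f}")
print(f"Dev absent: {without_dev:.2f}")
```

With only four panelists, one absence shifts the average by more than a full scale point even though the beer has not changed. Tracking who was present for each session makes this kind of shift much easier to diagnose later.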
In addition to presenting spiked samples to your panelists to investigate their usage of the intensity scale, you can present non-spiked samples of your various brands, preferably ones that have specific flavors that stand out from the others – things like coffee beers, sour beers, or beers with flavor extracts or other flavorful additives. These are a good segue into full-on flavor profiling of your brands: just choose a specific flavor to focus on and discuss.
One question that you’ll likely face from the panelists who are paying attention is: “Should I be considering the extreme ends of the scale as ‘what is possible in beer’ or ‘what is possible in anything’?” For example, if we’re rating the intensity of carbonation, do we consider a 15 (or whatever your scale’s maximum value is) to be the highest carbonation found in beer, or the highest carbonation level imaginable? For sourness, is the maximum considered to be something from an intentionally soured ale, or something like straight lemon juice? There is no perfect answer to this, and chances are that some flavor attributes will lend themselves better to the “in beer” approach, while others may work better with the “absolute” approach. The only thing I can suggest is that you open these discussions with your panelists (with appropriate standards in front of you), find some consensus that everyone understands and is willing to follow, and strive for a consistent application of your decision.
As I alluded to previously, each person has a unique response to a given flavor, and the differences between people can make life difficult for taste panel leaders. If you can find and train enough panelists (10 or more), then the variation between panelists is probably going to be fine and won’t affect the data too much if people come and go. The data you generate in profiling is an average of all your panelists’ scores, so as long as each of your panelists is consistent in their use of the intensity scale, individual variations will just become part of your data and won’t cause any discernible problems. Issues arise, however, when your panelists are not consistent in their ratings or when you have a small panel with poor attendance levels.
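One way to check whether each panelist is consistent is to slip the same sample into several sessions and look at how much that panelist's own ratings move around. The sketch below assumes you have such replicate ratings on hand; the names, the numbers, and the 1.5-point `max_sd` threshold are illustrative assumptions, not established cutoffs.

```python
from statistics import stdev


def inconsistent_panelists(
    replicates: dict[str, list[float]], max_sd: float = 1.5
) -> list[str]:
    """Return panelists whose replicate ratings of one sample vary by more than `max_sd`."""
    return sorted(
        name
        for name, scores in replicates.items()
        if len(scores) >= 2 and stdev(scores) > max_sd  # stdev needs at least two values
    )


# Hypothetical ratings of the same spiked sample across three sessions (0-15 scale)
replicates = {
    "Ana": [6, 7, 6],    # rates low, but steadily
    "Ben": [9, 9, 10],   # rates higher, but steadily
    "Cat": [3, 8, 12],   # all over the scale
}

print(inconsistent_panelists(replicates))  # ['Cat']
```

Note that Ana and Ben disagree with each other yet neither is flagged: each uses the scale the same way every time, so their difference simply becomes part of the panel average. Cat is the one whose ratings would add noise.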
Part v: Lexicon development
The first thing to do when arriving at this next level is to pick one or two brands to focus on, usually your high-volume flagship brands. Once you’ve decided which brands to look at first, put several samples of them in front of your panel and, through intensive discussion, build a lexicon of attributes that apply to those products. Ideally, you should have several flavor standards present so that the panelists can reference them in their discussions. Write down each of the suggested attributes on the whiteboard, and when the list is finished, help the panel decide which of them are the most appropriate for describing the flavor of your brands. Trim the list down to something like 4-6 “Core” attributes, which are the “marquee” flavors of the brand. Additionally, you should develop a list of attributes that are applicable across all your brands; I call these “Standard” attributes, and they are usually things like astringency, carbonation, sweetness, sourness, turbidity, and bitterness. Lastly, develop a list of “Negative” attributes that can potentially arise in your products. These could be flavors like isovaleric, papery/oxidized, lightstruck, acetic, metallic, etc. Your particular brand mix and the styles of beers you specialize in will dictate what the Core, Standard, and Negative attributes will be. For example, the Core attributes for bottled Corona should include lightstruck, while that attribute would be in the Negative category for a Sierra Nevada Pale Ale. Choose what’s appropriate for your production environment.
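If you keep your ballots in a spreadsheet or a script, the Core / Standard / Negative split maps naturally onto a small data structure. The sketch below is one possible layout: the class name `BrandBallot`, the brand "Flagship Pale Ale", and its Core attributes are all hypothetical placeholders, while the Standard and Negative lists are taken from the examples above.

```python
from dataclasses import dataclass

# Standard attributes shared by every brand (from the examples in the text)
STANDARD = ("astringency", "carbonation", "sweetness", "sourness", "turbidity", "bitterness")


@dataclass
class BrandBallot:
    brand: str
    core: list[str]                      # the brand's "marquee" flavors
    negative: list[str]                  # off-flavors to watch for in this brand
    standard: tuple[str, ...] = STANDARD

    def attributes(self) -> list[str]:
        """Every attribute a panelist rates for this brand, in ballot order."""
        return [*self.core, *self.standard, *self.negative]

    def conflicts(self) -> set[str]:
        """Attributes listed as both Core and Negative, which should never happen."""
        return set(self.core) & set(self.negative)


# Hypothetical brand and Core attributes, for illustration only
ballot = BrandBallot(
    brand="Flagship Pale Ale",
    core=["citrus hop", "pine", "caramel malt", "floral"],
    negative=["isovaleric", "papery/oxidized", "lightstruck", "acetic", "metallic"],
)

print(len(ballot.attributes()))  # 15
print(ballot.conflicts())        # set()
```

Keeping the Negative list per brand rather than global reflects the Corona example: the same attribute can be Core for one beer and Negative for another, so the category belongs to the brand, not to the flavor.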
The next article will cover flavor profiling, reporting, and data management. See you then!