
[Proofreading] Article 6: Analyzing Experiments with Ordered Categorical Data

Last edited by 小编H on 2011-7-13 09:55

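Group members interested in proofreading the article below: please reply with your estimated completion time and send a private message to 小编H, so that the editor can register the proofreaders and handle rewards and penalties once the article is finalized.
PS: Please send your e-mail address to the editor by private message. The original contains many images, so the document will be e-mailed to you for proofreading.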


Analyzing Experiments with Ordered Categorical Data
Six Sigma projects often deal with experiments whose outcomes are ordered categorical data rather than continuous data. It is important to know the right analysis methods for these cases, such as Jeng and Guo's weighted probability-scoring scheme (WPSS).
By Liem Ferryanto

Six Sigma projects in various industries often deal with experiments whose outcomes are not continuous variable data but ordered categorical data. Analysis of variance (ANOVA) is a technique for analyzing continuous experimental data, but it is not adequate for categorical experimental outcomes. Fortunately, many other methods have been developed for categorical experiments, such as Jeng and Guo's weighted probability-scoring scheme (WPSS).
The WPSS technique is easy to interpret and easy to implement in a spreadsheet program. The following case study, which involves medical devices, shows how a modified WPSS technique can be used to analyze experiments with ordered categorical data.
Determining the Best Factors
This study explores how contact lens design factors influence ease of lens insertion, meaning how easy it is for patients to put contact lenses in their eyes. Soft contact lenses are thin pieces of plastic or glass that float on the tear film on the surface of the cornea. They are shaped to fit the user's eye and are used to correct refractive errors such as nearsightedness, farsightedness and unequal curvature of the cornea (astigmatism). For this example, only three design factors of a certain lens type with fixed material properties are considered: lens thickness profile (3 levels), base curve dimension (3 levels) and base curve profile (2 levels). The analysis of ease of insertion follows a five-step process.
Step 1: Design an Experiment
Because this is an exploratory experiment, an L9 orthogonal matrix is used. The design matrix with the three lens design factors is shown in Table 1.
Table 1: L9 Orthogonal Matrix of Three Lens Design Factors
Experiment number   Thickness profile   Base curve dimension   Base curve profile
1                   1                   1                      1
2                   1                   2                      2
3                   1                   3                      1
4                   2                   1                      2
5                   2                   2                      1
6                   2                   3                      1
7                   3                   1                      1
8                   3                   2                      1
9                   3                   3                      2
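One property worth checking in a design like Table 1 is balance: for every pair of factors, each level of one factor should see the same mix of levels of the other. The sketch below is only an illustrative check on the matrix as printed. The names `DESIGN`, `FACTORS` and `is_proportionally_balanced` are made up for this example and do not come from the article.

```python
from collections import Counter
from itertools import combinations

# Table 1 rows: (thickness profile, base curve dimension, base curve profile)
DESIGN = [
    (1, 1, 1), (1, 2, 2), (1, 3, 1),
    (2, 1, 2), (2, 2, 1), (2, 3, 1),
    (3, 1, 1), (3, 2, 1), (3, 3, 2),
]
FACTORS = ("thickness profile", "base curve dimension", "base curve profile")


def is_proportionally_balanced(col_a, col_b):
    """True if every level of col_a sees the same distribution of col_b levels."""
    per_level = {}
    for a, b in zip(col_a, col_b):
        per_level.setdefault(a, Counter())[b] += 1
    distributions = list(per_level.values())
    return all(d == distributions[0] for d in distributions)


columns = list(zip(*DESIGN))
balance = {
    (FACTORS[i], FACTORS[j]): is_proportionally_balanced(columns[i], columns[j])
    for i, j in combinations(range(len(FACTORS)), 2)
}

for pair, ok in balance.items():
    print(pair, "balanced" if ok else "NOT balanced")
```

The two-level base curve profile column repeats level 1 twice as often as level 2, so the check compares proportions rather than requiring equal counts.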

Step 2: Plan Number of Samples and Data Categorization
In a small clinical trial, nine trained contact lens wearers are asked to try each of the nine lens designs from the L9 matrix and give their opinion on the ease of insertion. Each time a wearer inserts a lens, they rate how easy it was to do. The responses are integers from 1 to 10, from the worst condition, rated 1 (the wearer cannot insert the lens), to the best condition, rated 10 (the wearer needs only one attempt and the lens immediately sits in the right location on the eye). The ratings are grouped into four categories of ease of insertion:
- Category I (very easy to insert): ratings 9-10
- Category II (easy to insert): ratings 7-8
- Category III (moderately easy to insert): ratings 5-6
- Category IV (difficult to insert): ratings 1-4
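The grouping above can be sketched as a small lookup function. This is a minimal illustration, not the author's spreadsheet. The function names and the nine example ratings are invented for the sketch; the ratings were chosen so that their tally reproduces the category counts of run 1.

```python
from collections import Counter


def insertion_category(rating: int) -> str:
    """Map a 1-10 ease-of-insertion rating to category I, II, III or IV."""
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    if rating >= 9:
        return "I"
    if rating >= 7:
        return "II"
    if rating >= 5:
        return "III"
    return "IV"


def tally(ratings):
    """Count how many ratings fall in each category, ordered I, II, III, IV."""
    counts = Counter(insertion_category(r) for r in ratings)
    return [counts[c] for c in ("I", "II", "III", "IV")]


# Hypothetical ratings from nine wearers for one lens design (not from the article)
example_ratings = [9, 7, 8, 5, 6, 5, 6, 5, 3]
print(tally(example_ratings))  # [1, 2, 5, 1]
```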
The design matrix with the outcomes for each run is shown in Table 2.
Table 2: Insertion Ratings Grouped by Category
(Columns I to IV give the number of observations in each category.)
Experiment number   Thickness profile   Base curve dimension   Base curve profile   I   II   III   IV   Total
1                   1                   1                      1                    1   2    5     1    9
2                   1                   2                      2                    3   3    3     0    9
3                   1                   3                      1                    4   2    2     1    9
4                   2                   1                      2                    2   2    3     2    9
5                   2                   2                      1                    4   4    1     0    9
6                   2                   3                      1                    1   3    1     4    9
7                   3                   1                      1                    5   3    1     0    9
8                   3                   2                      1                    2   5    1     1    9
9                   3                   3                      2                    4   1    4     0    9

Step 3: Calculate Probability of the Outcomes Per Category and Run
To estimate the location and dispersion effects of each run, the category counts of each run must first be converted into probability values. Let $i = 1, 2, \ldots, I$ index the experiment runs (in this example, $I = 9$) and let $j = \mathrm{I}, \mathrm{II}, \ldots, J$ index the categories of experimental outcomes (in this example, $J = \mathrm{IV}$). The probability (proportion) that an outcome of the $i$-th run is placed in the $j$-th category is

$$p_{ij} = \frac{n_{ij}}{s_i}$$

where $n_{ij}$ is the number of outcomes in the $j$-th category of the $i$-th run and $s_i$ is the total number of outcomes across all categories in the $i$-th run.
For example, the probability of an outcome being placed in category III of the 1st run is $p_{1,\mathrm{III}} = n_{1,\mathrm{III}} / s_1 = 5/9 \approx 0.56$. The probability of the outcome in each category of each run is shown in Table 3.
Table 3: Probability of Outcomes
Experiment   Number of observations by category   Probability for each category
number       I    II   III   IV   Total           (I)    (II)   (III)   (IV)
1            1    2    5     1    9               0.11   0.22   0.56    0.11
2            3    3    3     0    9               0.33   0.33   0.33    0.00
3            4    2    2     1    9               0.44   0.22   0.22    0.11
4            2    2    3     2    9               0.22   0.22   0.33    0.22
5            4    4    1     0    9               0.44   0.44   0.11    0.00
6            1    3    1     4    9               0.11   0.33   0.11    0.44
7            5    3    1     0    9               0.56   0.33   0.11    0.00
8            2    5    1     1    9               0.22   0.56   0.11    0.11
9            4    1    4     0    9               0.44   0.11   0.44    0.00
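The conversion from counts to proportions can be sketched in a few lines. This is an illustrative script rather than the author's spreadsheet. The name `COUNTS` is an assumption of this example, and its values are copied from the count columns of the table above.

```python
# Category counts (I, II, III, IV) for runs 1 to 9
COUNTS = [
    (1, 2, 5, 1), (3, 3, 3, 0), (4, 2, 2, 1),
    (2, 2, 3, 2), (4, 4, 1, 0), (1, 3, 1, 4),
    (5, 3, 1, 0), (2, 5, 1, 1), (4, 1, 4, 0),
]


def category_probabilities(counts):
    """Return p_ij = n_ij / s_i for one run."""
    total = sum(counts)
    return [n / total for n in counts]


probabilities = [category_probabilities(row) for row in COUNTS]

for run, row in enumerate(probabilities, start=1):
    print(run, " ".join(f"{p:.2f}" for p in row))
```

Keeping the unrounded proportions, instead of the two-decimal values printed in the table, avoids small rounding drift in later calculations.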

Step 4: Estimate Location and Dispersion Effects of Each Run
Each category $j$ is given a weight $w_j$ equal to the upper limit of the ratings in that category (in this example, 10, 8, 6 and 4 for categories I to IV). The location score $W_i$ for the $i$-th run is defined by

$$W_i = \sum_{j} w_j \, p_{ij}$$
The rationale for using the upper limit of the category ratings is that the weight should reflect the rating values. The dispersion score $d_i^2$ measures the squared distance of the weighted proportions from target values $T_j$:

$$d_i^2 = \sum_{j} \left( w_j \, p_{ij} - T_j \right)^2$$
where the target values are $\{w_{\mathrm{I}}, 0, 0, \ldots, 0\}$ for categories $\{\mathrm{I}, \mathrm{II}, \mathrm{III}, \ldots, J\}$, respectively; that is, the upper limit of the category I ratings (here 10) for the best category and 0 for every other category.
The rationale for setting the target values this way is that only outcomes that fall in the best category are rewarded. For example, the location score for the 1st run is $W_1 = 10(0.11) + 8(0.22) + 6(0.56) + 4(0.11) = 6.7$, and, using the unrounded proportions, the dispersion score is $d_1^2 = (10 \cdot \tfrac{1}{9} - 10)^2 + (8 \cdot \tfrac{2}{9} - 0)^2 + (6 \cdot \tfrac{5}{9} - 0)^2 + (4 \cdot \tfrac{1}{9} - 0)^2 = 93.48$. The location and dispersion scores of the outcomes of each run are shown in Table 4.
Table 4: Location, Dispersion and Mean Square Deviation Scores
Experiment number   Thickness profile   Base curve dimension   Base curve profile   Location score (Wi)   Dispersion score (di^2)   MSD
1                   1                   1                      1                    6.7                   93.5                      0.16
2                   1                   2                      2                    8.0                   55.6                      0.06
3                   1                   3                      1                    8.0                   36.0                      0.04
4                   2                   1                      2                    6.9                   68.4                      0.11
5                   2                   2                      1                    8.7                   44.0                      0.04
6                   2                   3                      1                    6.2                   89.7                      0.21
7                   3                   1                      1                    8.9                   27.3                      0.03
8                   3                   2                      1                    7.8                   80.9                      0.08
9                   3                   3                      2                    8.0                   38.8                      0.04
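The location and dispersion columns can be recomputed from the category counts. The sketch below assumes the weights 10, 8, 6, 4 (the upper rating of each category) and the targets 10, 0, 0, 0 described in this step. The names `WEIGHTS`, `TARGETS` and the two functions are made up for this example.

```python
# Category counts (I, II, III, IV) for runs 1 to 9
COUNTS = [
    (1, 2, 5, 1), (3, 3, 3, 0), (4, 2, 2, 1),
    (2, 2, 3, 2), (4, 4, 1, 0), (1, 3, 1, 4),
    (5, 3, 1, 0), (2, 5, 1, 1), (4, 1, 4, 0),
]
WEIGHTS = (10, 8, 6, 4)   # upper rating limit of categories I to IV
TARGETS = (10, 0, 0, 0)   # only the best category is rewarded


def weighted_proportions(counts):
    """Return w_j * p_ij for each category of one run."""
    total = sum(counts)
    return [w * n / total for w, n in zip(WEIGHTS, counts)]


def location_score(counts):
    """W_i: sum of the weighted proportions."""
    return sum(weighted_proportions(counts))


def dispersion_score(counts):
    """d_i^2: squared distance of the weighted proportions from the targets."""
    return sum((x - t) ** 2
               for x, t in zip(weighted_proportions(counts), TARGETS))


for run, counts in enumerate(COUNTS, start=1):
    print(run, f"{location_score(counts):.1f}", f"{dispersion_score(counts):.1f}")
```

A higher location score is better, while a lower dispersion score means the weighted proportions sit closer to the "everything in category I" target.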
One performance measure that combines location and dispersion effects is the mean square deviation (MSD), which allows practitioners to make judgments in one step. If the outcome is a larger-the-better characteristic, its expected MSD can be approximated in terms of the location and dispersion scores as follows:

$$E_i \approx \frac{1}{W_i^2} \left( 1 + \frac{3 \, d_i^2}{W_i^2} \right)$$
For example, the expected MSD for the 1st run is $E_1 = \frac{1}{6.67^2} \left( 1 + \frac{3 \times 93.5}{6.67^2} \right) = 0.16$. The MSD scores for all runs are given in Table 4.
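As a quick check of the MSD column, the approximation can be applied to the location and dispersion scores printed in Table 4. This is a minimal sketch. The name `TABLE4_SCORES` and the function `expected_msd` are assumptions of this example, and the inputs are the rounded table values rather than unrounded scores.

```python
# (location score W_i, dispersion score d_i^2) for runs 1 to 9, as printed in Table 4
TABLE4_SCORES = [
    (6.7, 93.5), (8.0, 55.6), (8.0, 36.0),
    (6.9, 68.4), (8.7, 44.0), (6.2, 89.7),
    (8.9, 27.3), (7.8, 80.9), (8.0, 38.8),
]


def expected_msd(location, dispersion):
    """Larger-the-better approximation: (1 / W^2) * (1 + 3 * d^2 / W^2)."""
    w_squared = location ** 2
    return (1.0 / w_squared) * (1.0 + 3.0 * dispersion / w_squared)


msd_scores = [expected_msd(w, d2) for w, d2 in TABLE4_SCORES]

for run, msd in enumerate(msd_scores, start=1):
    print(run, f"{msd:.2f}")
```

A run scores well (low MSD) when its location score is high and its dispersion score is low at the same time, which is why run 7 comes out best and run 6 worst.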
The location, dispersion and expected MSD effects of each design factor are shown in Figures 1, 2 and 3, together with the range Tmax - Tmin between the highest and lowest level means. A higher Tmax - Tmin value, or a steeper main-effects curve, indicates a stronger influence of that design factor on the outcomes.
Figure 1: Effects and Optimal Solutions for Location Scores
Factor level    Thickness profile   Base curve dimension   Base curve profile
1               7.6                 7.5                    7.7
2               7.3                 8.1                    7.6
3               8.2                 7.4                    Not available
Tmax - Tmin     1.0                 0.7                    0.1
Optimal level   3                   2                      1



Figure 2: Effects and Optimal Solutions for Dispersion Scores
Factor level    Thickness profile   Base curve dimension   Base curve profile
1               61.7                63.1                   61.9
2               67.4                60.1                   54.3
3               49.0                54.8                   Not available
Tmax - Tmin     18.4                8.2                    7.6
Optimal level   3                   3                      2



Figure 3: Effects and Optimal Solutions for MSD Scores
Factor level    Thickness profile   Base curve dimension   Base curve profile
1               0.09                0.10                   0.09
2               0.12                0.06                   0.07
3               0.05                0.10                   Not available
Tmax - Tmin     0.07                0.04                   0.02
Optimal level   3                   2                      2
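The level means behind Figures 1 to 3 can be sketched as averages of the run scores at each factor level. The script below recomputes the scores from the category counts, assuming weights 10, 8, 6, 4 and targets 10, 0, 0, 0. All names (`DESIGN`, `COUNTS`, `run_scores`, `main_effects`, `best_level`) are illustrative and not taken from the article.

```python
from collections import defaultdict

FACTORS = ("thickness profile", "base curve dimension", "base curve profile")
DESIGN = [
    (1, 1, 1), (1, 2, 2), (1, 3, 1),
    (2, 1, 2), (2, 2, 1), (2, 3, 1),
    (3, 1, 1), (3, 2, 1), (3, 3, 2),
]
COUNTS = [
    (1, 2, 5, 1), (3, 3, 3, 0), (4, 2, 2, 1),
    (2, 2, 3, 2), (4, 4, 1, 0), (1, 3, 1, 4),
    (5, 3, 1, 0), (2, 5, 1, 1), (4, 1, 4, 0),
]
WEIGHTS = (10, 8, 6, 4)
TARGETS = (10, 0, 0, 0)


def run_scores(counts):
    """Location, dispersion and expected MSD for one run."""
    total = sum(counts)
    weighted = [w * n / total for w, n in zip(WEIGHTS, counts)]
    location = sum(weighted)
    dispersion = sum((x - t) ** 2 for x, t in zip(weighted, TARGETS))
    msd = (1 + 3 * dispersion / location ** 2) / location ** 2
    return {"location": location, "dispersion": dispersion, "msd": msd}


def main_effects(values, factor_index):
    """Mean of the run values at each level of one design factor."""
    by_level = defaultdict(list)
    for levels, value in zip(DESIGN, values):
        by_level[levels[factor_index]].append(value)
    return {level: sum(v) / len(v) for level, v in sorted(by_level.items())}


scores = [run_scores(c) for c in COUNTS]
effects = {
    (measure, name): main_effects([s[measure] for s in scores], k)
    for measure in ("location", "dispersion", "msd")
    for k, name in enumerate(FACTORS)
}


def best_level(measure, name):
    """Highest mean for location, lowest mean for dispersion and MSD."""
    means = effects[(measure, name)]
    pick = max if measure == "location" else min
    return pick(means, key=means.get)


for (measure, name), means in effects.items():
    spread = max(means.values()) - min(means.values())
    print(f"{measure:10s} {name:21s} range={spread:.2f} best=level {best_level(measure, name)}")
```

Working from the unrounded run scores matters here: averaging the one-decimal values printed in Table 4 can shift a level mean by a rounding step.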



Step 5: Determine Optimal Solutions
For each design factor, the optimal level is the one with the highest location value, the lowest dispersion value or the lowest expected MSD value, depending on which criterion is used. The optimal solution based on the expected MSD criterion is the optimal trade-off between maximal location and minimal dispersion scores.
The predicted optimal solution based on the expected MSD criterion is thickness profile at level 3, base curve dimension at level 2 and base curve profile at level 2. But if practitioners know there are interaction effects among design factors, they cannot depend solely on the main-effect values or plots to choose the settings of the design factors. The interaction plot for the expected MSD effects shows that thickness profile interacts heavily with base curve dimension (Figure 4). A small interaction also exists between base curve dimension and base curve profile. After taking interaction effects into consideration, practitioners need to examine whether the chosen design factor levels still give optimal experiment outcomes.
Figure 4: Interaction Plot of Thickness Profile, Base Curve Dimension and Base Curve Profile



In this case, thickness profile at level 3 gives the lowest MSD scores almost consistently across the levels of base curve dimension and consistently across the levels of base curve profile. Thus, it gives the optimal effect on the experiment outcomes. Base curve dimension at level 2 gives the lowest MSD scores almost consistently across the levels of thickness profile and consistently across the levels of base curve profile. Thus, it too gives the optimal effect on the experiment outcomes. The Tmax - Tmin value of base curve profile is the lowest and its curve is flat. Thus, base curve profile has an insignificant influence on the outcomes and can be set at either level 1 or level 2. Therefore, the expected MSD predicts that a lens design with thickness profile at level 3, base curve dimension at level 2 and base curve profile at either level 1 or 2 would give the optimal ease of insertion.
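The numbers behind an interaction view can be tabulated by arranging the per-run expected MSD by thickness profile and base curve dimension. The sketch below is only an illustration under the same assumptions as before (weights 10, 8, 6, 4 and targets 10, 0, 0, 0), with invented names. In an L9 design each thickness-by-dimension combination occurs in exactly one run, so every cell is a single-run value that also carries that run's base curve profile level.

```python
DESIGN = [
    (1, 1, 1), (1, 2, 2), (1, 3, 1),
    (2, 1, 2), (2, 2, 1), (2, 3, 1),
    (3, 1, 1), (3, 2, 1), (3, 3, 2),
]
COUNTS = [
    (1, 2, 5, 1), (3, 3, 3, 0), (4, 2, 2, 1),
    (2, 2, 3, 2), (4, 4, 1, 0), (1, 3, 1, 4),
    (5, 3, 1, 0), (2, 5, 1, 1), (4, 1, 4, 0),
]
WEIGHTS = (10, 8, 6, 4)
TARGETS = (10, 0, 0, 0)


def expected_msd(counts):
    """Expected MSD of one run from its category counts."""
    total = sum(counts)
    weighted = [w * n / total for w, n in zip(WEIGHTS, counts)]
    location = sum(weighted)
    dispersion = sum((x - t) ** 2 for x, t in zip(weighted, TARGETS))
    return (1 + 3 * dispersion / location ** 2) / location ** 2


# Cell (thickness level, base curve dimension level) -> expected MSD of that run
msd_cells = {
    (thickness, dimension): expected_msd(counts)
    for (thickness, dimension, _profile), counts in zip(DESIGN, COUNTS)
}

print("rows: thickness profile level, columns: base curve dimension level")
for thickness in (1, 2, 3):
    row = " ".join(f"{msd_cells[(thickness, d)]:.3f}" for d in (1, 2, 3))
    print(thickness, row)
```

Reading down a column compares thickness levels at a fixed base curve dimension; reading across a row does the reverse.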

Easy-to-Implement Optimization Method
A modified WPSS is a simple and straightforward method for dealing with ordered categorical data. This case study shows that a single performance measure, the MSD derived from WPSS, can provide insight into a system through experiments and can direct practitioners to the optimal solution.
About the Author: Liem Ferryanto, Ph.D., is project director and Six Sigma Champion of global research, development and engineering at CIBA Vision Corp., a Novartis company, in Duluth, Ga., USA. He can be reached at lferryanto@gmail.com.
ytlsguoxia (reputation: 0) (Yantai, Shandong) Automotive manufacturing, Manager

OK.
Everyone got the news quickly.
