How to measure anything

21.08.2022, admin

Modern people are surrounded by a vast and generous field of information. Yet when we face real problems that hinge on the need to «find out this» or «measure that», it regularly turns out that we either give in to the apparent difficulty and act as if the information did not exist, or decide to estimate «by eye».

In doing so we cannot even imagine how much money, time, and resources we lose, because to find that out we would have to measure the very thing we refused to measure! And the problem exists at every level, from the small private business to the largest government bodies.

The truth is that any measurement problem, however complex, tangled, or poorly defined, can be solved by one method or another.

Moreover, even when it is impossible (or pointless) to count up certain objects, finances, or, say, consumer preferences and reduce the result to a single number, you can at least narrow the range of uncertainty, and so gain far more certainty about the question on which a sound decision depends.

Another side of the truth is that you actually know far more than you think you do. You just need to understand how to apply that knowledge.

This book explains how to learn what was previously unknown and become a far better estimator of anything at all.

1. Measurement: a solution exists

1.1. Anything can be measured, provided that the object, factor, or phenomenon being measured exists at all. Such measurements can be made by economically justified means. Even if the measurements are approximate, they still tell you more than you knew about the object or phenomenon before, and so they can be worthwhile.

1.2. The word «intangible» has two main readings, and they should not be confused. If it refers to things that are not physical or palpable, then such things certainly exist. If «intangible» is used to mean «not amenable to any measurement», that reading is wrong.

Examples of intangibles (in the first sense): time; budget; ownership of a patent; the «flexibility» needed to create new products; the risk that a project fails; the effect of a new government policy on public health; the productivity of research; the value of information; the chance that a given political party wins the White House; quality; public opinion; and so on.

1.3. Many people, believing that the «intangible» cannot be measured, make decisions that work against them. Important factors are left out of an evaluation because people do not see how to quantify the potential benefit (or potential loss): the calculation is assumed to be impossible. Weaker proposals that are easier to evaluate win out.

Douglas Hubbard’s book «How to Measure Anything»: part 1

This is the «what I underlined while reading» column, also known as book notes. The full title is «How to Measure Anything. Finding the Value of Intangibles in Business». As is usual for business books of this kind, there is a lot of padding and repetition, but it is still interesting.

Chapter 1. Intangibles and the challenge of measuring them.

Anything can be measured ‹…› However approximate a measurement is, it still counts as a measurement if it tells you more than you knew before. And the things most often considered immeasurable can almost always be assessed by a relatively simple method.

Chapter 2. An intuitive habit of measuring everything: Eratosthenes, Enrico, and Emily

Which variables carried the most uncertainty: the share of households that regularly use piano tuners, the frequency of tunings, the number of instruments that can be tuned in a day, or something else? The largest source of uncertainty pointed to the measurements that would reduce it the most.

The therapists, unable to see the girl behind a screen, had to determine which of their hands, right or left, she was holding her palm over, relying only on their own sensation of her energy field. Emily reported her results at a science fair and was awarded a blue ribbon, as was every other participant.

Randi dislikes being described as «debunking» paranormal claims, since he simply tests those claims with objective scientific methods. But since hundreds of applicants for the million-dollar prize have failed to win it, unable to pass even the simplest scientific tests, so far such claims have only ever been refuted.

If MII really improves the quality of the services provided, it should affect how customers perceive those services and, ultimately, revenue. Simply ask a random sample of customers to rank the quality of some services from before and after MII was created (without their knowing which period they are rating) and find out whether the improved quality led them to buy more services from Mitre.

If quality and innovation really did improve, shouldn’t the difference at least be noticeable? ‹…› If MII benefits such as quality, innovation, and any others cannot be detected, they do not matter.

Meanwhile, even a small reduction in uncertainty can be worth millions, depending on the importance of the decision it supports and on how often such decisions are made.

Things regarded in business as immeasurable can usually be quantified with the simplest observation techniques, once people understand that immeasurability is only an illusion.

Chapter 3. Why the immeasurability of intangibles is only an illusion

Alongside doubts about whether a measurement can be made, there is a belief that sometimes a quantitative assessment should not be made. The objections raised ‹…› include objections to the usefulness and meaningfulness of statistics in general (for example, the claim that «you can prove anything with statistics») and ethical objections (the claim that measuring some things is simply immoral).

Measurement is a set of observations that reduce uncertainty, where the result is expressed as a quantity.

The presence of error that cannot be avoided completely, while the result is still a step forward from prior knowledge, is the key idea behind experiments, surveys, and other scientific measurements.

The view of information as a reduction of uncertainty matters enormously for business. Many decisions (for example, whether to adopt a new information technology or develop a new product) are made under uncertainty, and even a slight reduction of it leads to better choices.

When asked how to measure strategic alignment, flexibility, or customer satisfaction, I answer: «What exactly do you mean?» It is interesting to watch how often people, in clarifying the term they use, effectively answer their own question.

First we recognize that if object X matters to us, then by definition it must show itself in some way. Could things such as quality, risk, security, or reputation have any value for us if they did not reveal themselves directly or indirectly?

Managers need to understand that some things seem intangible only because people have not settled what they are talking about.

Call these people and ask how long they usually spend commuting. Suppose the answers are 30, 60, 45, 80, and 60 minutes. Take the highest and lowest values in the sample: 30 and 80. The chance that the median commute time for the whole population of employees lies within that interval is 93.75%.

If you randomly picked five values that were all above or all below the median, the median would fall outside the interval. But what is the chance of such a pick?
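The question has a short arithmetic answer. As a minimal sketch (the function name is illustrative, not from the book): each randomly chosen value has a 50% chance of landing above the population median, so the chance that all five land on the same side is 2 × 0.5^5, and the rest is the chance that the sample brackets the median.

```python
def chance_median_within_sample_range(sample_size: int = 5) -> float:
    """Chance that the population median lies between the smallest and
    largest values of a random sample of the given size."""
    # Each independent draw lands above the median with probability 0.5.
    all_above = 0.5 ** sample_size
    all_below = 0.5 ** sample_size
    # The median escapes the sample range only if every draw is on one side.
    return 1.0 - (all_above + all_below)


print(chance_median_within_sample_range(5))  # 0.9375
```

The same function shows how quickly the chance grows with sample size, without any assumption about the shape of the population.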

A brain-melter, and a sample of the editing quality:

But I will say that if these are only assumptions, they are counterproductive. Let us adopt other premises which, like any others, do not always hold in particular cases but in practice are far more useful.

‹…› Specialists in every field tend to regard their area as unique in its level of uncertainty. They usually say: «Unlike other industries, in ours every problem is unique and unpredictable», or «In my industry there are too many factors to express them quantitatively», and so on. I have worked in many different industries and heard the same thing in each. Yet so far the measurement problems have turned out to be standard everywhere and no different from one another.

The Cleveland Orchestra wanted to assess whether the quality of its performances was improving over time ‹…› approached the problem creatively and began counting how many times the audience gave a standing ovation.

What are you trying to do: publish in a scientific journal, or reduce uncertainty in a real business decision? Treat measurement as an iterative process. Start measuring what you need.

«All events are equally likely, since we do not know what will happen» (said by a participant in my seminar).

The point is that when people say «You can prove anything with statistics», they most likely mean not statistics as such but the use of numbers in general (especially, for some reason, percentages). They do not really mean «anything», or do not really mean «prove». The real sense of the saying is that «numbers can be used to confuse people, especially gullible ones who are not at ease with math». With that I fully agree.

Clearly, we use probabilities precisely because we cannot be certain of the result.

Someone who says he never takes risks nevertheless flies to Moscow on Aeroflot (an airline with a worse accident record than any American carrier) to collect a million-dollar prize.

When someone chooses total ignorance over the chance to fill at least some gaps in their knowledge, that can hardly be called the moral high ground.

Chapter 4. Clarifying the measurement problem

Suppose, for example, that you want to assess product quality. Then you have to work out which decisions the result will affect, and answer the more general question of what product quality means in the first place. Do you want to use the information to decide whether to change an existing production process? If so, how low must quality be for that decision to be made? Do you need quality data to calculate managers’ bonuses under a quality program? If so, by what formula will the bonuses be calculated? And of course, the answers depend first of all on what you mean by «product quality».

Many government employees picture business as a kind of fairy-tale world of high efficiency and motivation, where fear of losing to competitors makes people work as hard as they can. How often one hears them regret that they do not work in business.

Previously the Department had taken an entirely different approach to measuring security. It used metrics such as the number of employees who had completed training courses and the number of computers on which certain software had been installed. In other words, results were not measured at all. All the earlier effort went into quantifying whatever was easiest to measure.

Learning to express quantitatively how approximate your knowledge of an unknown quantity is forms an important step in choosing a way of measuring it that fits your needs.

Chapter 5. Calibrated estimates: how much do you know now?

Even without exact answers to such questions, you still know something. For example, some values of the quantity you care about seem more likely than others.

Start with a small experiment to check that the confidence intervals you gave really are 90% intervals. Take one of the questions with such an interval, say: when did Newton publish his work on the law of universal gravitation? Suppose I offer you a chance to win $1,000 in one of two ways: 1) you win $1,000 if the year Newton’s book was published falls between the upper and lower bounds of your interval, and nothing if it does not; 2) you spin a dial divided into two unequal sectors covering 90% and 10% of its area. If the pointer lands on the larger sector you win $1,000, and if it lands on the smaller one you win nothing (that is, the chance of winning is 90%). Which option do you choose? ‹…› If you are like most people (about 80%), you prefer to spin the dial. But why? The only explanation is that you believe the dial gives you a better chance of winning. The conclusion: the range you called a 90% confidence interval is not really one. Your actual confidence is more like 80%, 65%, or even 50%. In statistical terms this is called overconfidence.

Studies show that even when people only pretend they are risking money, their ability to assess odds improves significantly.

People who gauge their own confidence accurately (that is, who turn out to be right 80% of the time when they say they are 80% sure) are called calibrated.

A 90% confidence interval means there is a 5% chance that the true value lies above the upper bound of the range and the same chance that it lies below the lower bound. It follows that the estimator should be 95% sure that the true value is below the upper bound.

As long as a person believes that subjective probability is somehow inferior to objective probability, he cannot calibrate his estimates.

An assumption means that, for the sake of argument, we treat some statement as true regardless of whether it actually is.

Chapter 6. Measuring risk: an introduction to Monte Carlo simulation

We have drawn a distinction between uncertainty and risk ‹…› Risk is simply a state of uncertainty that involves possible loss of some kind.

When I find that this is the case, I sometimes ask how «medium» the risk is. Is a 5% chance of losing more than $5 million a low, medium, or high risk? Nobody knows. Which is better: a medium-risk investment with a 15% return or a high-risk one with a 50% return? Again, nobody knows.

Over the years I have found that when an organization does use quantitative risk analysis, it is usually for routine operational decisions. The biggest, riskiest decisions are most often made without any prior analysis of the risks involved, or at least without any analysis that an actuary or statistician would recognize. I call this phenomenon the «risk paradox».

Chapter 7. Measuring the value of information

The cost of being wrong is the difference between the wrong choice you made and the best alternative available, that is, the one you would have chosen with perfect information.
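One common way to put a number on this, sketched here under assumptions that are not in the excerpt: for a simple two-outcome decision, the expected opportunity loss is the chance of being wrong multiplied by the cost of being wrong, and it caps what perfect information about that decision could be worth. The function name and the figures are made up for illustration.

```python
def expected_opportunity_loss(chance_wrong: float, cost_if_wrong: float) -> float:
    """Expected cost of error for a two-outcome decision (assumed model)."""
    if not 0.0 <= chance_wrong <= 1.0:
        raise ValueError("chance_wrong must be a probability between 0 and 1")
    return chance_wrong * cost_if_wrong


# Hypothetical campaign: 40% chance it flops, losing 5 million if it does.
eol = expected_opportunity_loss(0.40, 5_000_000)
print(f"Upper bound on the value of perfect information: {eol:,.0f}")
```

A measurement that costs more than this bound cannot pay for itself on this decision, however good it is.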

The value of a partial reduction in uncertainty

If we can make a measurement that at least raises the lower bound above the threshold of 200,000 units sold, the possibility of loss is eliminated ‹…› The difference between the value of information that cuts uncertainty in half and information that cuts it by three quarters may be quite small.

‹…› If the value of information is zero, the cost of any measurement will be excessive.

Again and again I have found that people spend a great deal of time, effort, and money measuring things with little information value and ignore quantities that really matter for decisions.

Managers also like to make measurements whose results are likely to please them. Why, after all, measure the benefit if you suspect it will turn out to be zero? Of course, in that case the managers are reasoning like people who ask for money or create the appearance of work, not like executives who sign the checks.

Do not plan large-scale studies when you need to measure something you currently know almost nothing about. Measure at least something, remove at least some uncertainty, and review what you have learned.

How to Measure Anything. Finding the Value of Intangibles in Business

About the book «How to Measure Anything. Finding the Value of Intangibles in Business»

Now updated with new measurement methods and new examples, How to Measure Anything shows managers how to inform themselves in order to make less risky, more profitable business decisions. This insightful and eloquent book will show you how to measure those things in your own business, government agency or other organization that, until now, you may have considered «immeasurable», including customer satisfaction, organizational flexibility, technology risk, and technology ROI.

Adds new measurement methods, showing how they can be applied to a variety of areas such as risk management and customer satisfaction.

Simplifies overall content while still making the more technical applications available to those readers who want to dig deeper.

Continues to boldly assert that any perception of «immeasurability» is based on certain popular misconceptions about measurement and measurement methods.

Shows the common reasoning for calling something immeasurable, and sets out to correct those ideas.

Offers practical methods for measuring a variety of «intangibles».

Provides an online database (www.howtomeasureanything.com) of downloadable, practical examples worked out in detailed spreadsheets.

Written by recognized expert Douglas Hubbard, creator of Applied Information Economics, How to Measure Anything, Third Edition illustrates how the author has used his approach across various industries and how any problem, no matter how difficult, ill defined, or uncertain, can lend itself to measurement using proven methods.

Book Notes by Abi Noda

ISBN: 978-1118539279
READ: June 14, 2020
ENJOYABLE: 7/10
INSIGHTFUL: 10/10
ACTIONABLE: 10/10

Anything that can be observed can be measured. Things like «employee empowerment» or «creativity» must have observable consequences if they matter at all.

If you know almost nothing, almost anything will tell you something.

In many companies, intangibles (e.g. «flexibility to create new products») are assumed to be immeasurable, so decisions are not well informed. Companies skip over investments where benefits are «soft» and considered immeasurable, such as «premium brand positioning» or «improved word of mouth advertising», in favor of minor cost saving ideas because they are easy to measure. Conversely, key strategic principles or core values of companies (e.g. «improving customer relationships») are treated as a «must have», so they get investment regardless of the degree to which those investments have measurable effectiveness.

If the outcome of a decision is highly uncertain and has significant consequences, then measurements that reduce uncertainty about it have a high value.

e.g. «Where should investments be made to improve developer productivity?»

Illusion of intangibles

Three reasons people think something can’t be measured:

Definition of measurement = a quantitatively expressed reduction of uncertainty based on one or more observations.

Measurement does not need to be infinitely precise. In fact, lack of reported error means it’s not really a proper measurement at all.

If someone asks how to measure «strategic alignment», ask «what do you mean?» Once you have figured out what something is, it becomes a lot easier to measure.

If it matters at all, it is detectable/observable. If it is detectable, it can be detected as an amount or range of possible amounts. If it can be detected as a range of possible amounts, it can be measured.

If you can identify even a single observation that would be different between an organization with greater or lesser «employee empowerment», then you are on your way to measuring it.

Rule of five = 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from the population
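A quick way to convince yourself of the rule of five is to simulate it. This is only a sketch: the population below is made up and deliberately skewed, and the helper name is mine, not the book’s.

```python
import random
import statistics


def rule_of_five_hit_rate(population, trials=20_000, seed=0):
    """Fraction of random 5-item samples whose min..max range contains
    the population median."""
    rng = random.Random(seed)
    median = statistics.median(population)
    hits = 0
    for _ in range(trials):
        sample = rng.choices(population, k=5)  # sampling with replacement
        if min(sample) <= median <= max(sample):
            hits += 1
    return hits / trials


# Hypothetical, deliberately skewed population (say, commute times in minutes).
rng = random.Random(1)
population = [rng.lognormvariate(3.5, 0.6) for _ in range(10_001)]
print(round(rule_of_five_hit_rate(population), 3))  # should land close to 0.9375
```

The hit rate does not depend on the distribution chosen, which is why the rule works when you know almost nothing about the population.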

No matter how difficult or unique your measurement problem seems to you, assume it has been done already by someone else, perhaps in another field if not your own. I’ve noticed that there is a tendency among professionals in every field to perceive their field as unique in terms of the burden of uncertainty. The conversation generally goes something like this: «Unlike other industries, in our industry every problem is unique and unpredictable,» or «Problems in my field have too many factors to allow for quantification.»

Don’t assume that the only way to reduce uncertainty is to use an impractically sophisticated method. Think of measurement as iterative. Start measuring it. You can always adjust the method based on initial findings. Examples:

School wanted to measure online teacher performance. They defined sampling methods that allowed managers to select recordings of sessions and particular slices of time, each a minute or two long, throughout a recorded session. For those randomly chosen time slices, they could sample what the teacher was saying and what the students were doing.

Cleveland Orchestra wanted to measure whether performances were improving. Many business analysts might propose a repeated randomized patron survey. But the orchestra started counting the number of standing ovations.

Defining what you want to measure

Define the problem. What is your dilemma?

If managers can’t identify a decision that could be affected by a proposed measurement and how it could change those decisions, then the measurement has no value.

Managers say some measurement helps make decisions without specifying any particular decision. Managers say they need to measure something, without being able to state what they would change if they knew more about that something.

Requirements for a decision:

Measurement of uncertainty = set of probabilities assigned to a set of possibilities, e.g. «there is a 60% chance that this market will double in five years»

Doug was tasked with identifying performance metrics for various security-related systems being proposed for the VA. Previous approaches focused on activities like counting the number of people who completed training courses, number of desktops that had certain systems installed. These efforts were focused on what was considered easy to measure, but wasn’t measuring results at all. Doug asked: «What do you mean by IT security? What does improved IT security look like? What would we see or detect that would be different if security were better or worse?» Turned out they wanted to evaluate whether any of the proposed investments were justified. Realized that better security means observing a reduction in the frequency and impact of security incidents.

This is so similar to developer velocity!

Confidence interval = a range that has a particular chance of containing the correct answer, e.g. a 90% confidence interval is a range that has a 90% chance of containing the correct answer.

Calibration training = helping estimators understand whether they are overconfident or underconfident
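A crude first check of calibration, as a sketch only: collect a batch of 90% intervals, look up the true answers, and count how many intervals contain them. The data and helper names below are made up, and ten questions is far too few for a firm verdict.

```python
def share_of_intervals_containing_truth(estimates):
    """estimates: iterable of (lower, upper, true_value) tuples."""
    estimates = list(estimates)
    hits = sum(1 for lower, upper, truth in estimates if lower <= truth <= upper)
    return hits / len(estimates)


def diagnose(hit_rate, stated_confidence=0.90):
    """Compare the observed hit rate with the confidence the estimator claimed."""
    if hit_rate < stated_confidence:
        return "overconfident"
    if hit_rate > stated_confidence:
        return "underconfident"
    return "calibrated"


# Hypothetical answers to ten questions, each given as a 90% CI plus the truth.
answers = [
    (10, 20, 15), (100, 200, 250), (5, 8, 6), (1000, 1500, 900), (30, 40, 35),
    (2, 4, 5), (50, 90, 70), (0.1, 0.3, 0.2), (400, 600, 650), (7, 9, 8),
]
rate = share_of_intervals_containing_truth(answers)
print(rate, diagnose(rate))  # 0.6 overconfident
```

An estimator whose «90%» ranges contain the truth only 60% of the time needs wider ranges, which is the feedback calibration training provides.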

Could calibration training be productized?

It is better to be approximately right than to be precisely wrong. –Warren Buffett

Studies have shown that it is quite possible to experience an increase in confidence about decisions and forecasts without actually improving things.

Monte Carlo simulation uses a computer to generate a large number of scenarios based on probabilities for inputs.
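A minimal sketch of the idea, with everything in it assumed for illustration: each input is given as a 90% CI, treated as a normal distribution (a 90% interval spans about ±1.645 standard deviations), and the inputs and business model are hypothetical.

```python
import random

Z_90 = 1.645  # a 90% interval spans +/-1.645 standard deviations of a normal


def draw_from_90ci(rng, lower, upper):
    """Draw one value from a normal distribution whose 90% CI is lower..upper."""
    mean = (lower + upper) / 2
    sd = (upper - lower) / (2 * Z_90)
    return rng.gauss(mean, sd)


def chance_of_loss(trials=50_000, seed=0):
    """Share of simulated scenarios in which savings fail to cover the cost."""
    rng = random.Random(seed)
    losses = 0
    for _ in range(trials):
        savings_per_unit = draw_from_90ci(rng, 2.0, 6.0)        # dollars
        units_per_year = draw_from_90ci(rng, 80_000, 120_000)
        annual_cost = draw_from_90ci(rng, 300_000, 400_000)     # dollars
        if savings_per_unit * units_per_year < annual_cost:
            losses += 1
    return losses / trials


print(f"Chance the investment loses money: {chance_of_loss():.1%}")
```

The output is a probability of loss rather than a «medium risk» label, which is what makes different investments comparable.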

How to measure

Decomposition = breaking down a measurement problem into constituent parts that can be directly observed

To measure: do forensic analysis of data you already have. Follow the trail. Use direct observation: start looking, counting, and/or sampling if possible. If there’s no trail, add a «tracer» so it starts leaving a trail. If you can’t follow a trail at all, create the conditions to observe it (an experiment).

Do a random sample for surveys

Removing names from essay tests graded by teachers removes possible bias a teacher might have about students. In clinical research studies, neither doctors nor patients know who is taking a drug and who is taking a placebo.

Can this be incorporated into dx? Anonymize reviews or code for tagging?

A method to reduce either systemic or random error is a «control».

Random sampling is a control because random effects, while individually unpredictable, follow predictable patterns in the aggregate, e.g. you can’t predict a coin flip, but you do know that there’ll be 500 +/- 26 heads if you flip 1,000 times. Systemic errors are much harder to compute an error range for.
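The coin-flip figure can be reproduced with a normal approximation to the binomial. Reading the ±26 as a 90% range is my assumption; the notes do not state the confidence level.

```python
import math


def heads_range_90(flips, p_heads=0.5):
    """Expected number of heads and the 90% margin (normal approximation)."""
    expected = flips * p_heads
    sd = math.sqrt(flips * p_heads * (1 - p_heads))
    margin = 1.645 * sd  # 90% of a normal lies within 1.645 sd of the mean
    return expected, margin


expected, margin = heads_range_90(1_000)
print(f"{expected:.0f} +/- {margin:.0f} heads")  # 500 +/- 26 heads
```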

In business, people often choose precision with unknown systemic error over a highly imprecise measurement with random error.

Example: to determine how much time sales reps spend in meetings. Time sheets have error, especially those turned in at 5pm on Friday. It is better to directly observe a random sample, by checking random reps at random times of day to see if they are in customer calls, than to review all timesheets. Random sampling provides only a range, but this is preferable. However, if you merely want to measure the change in time spent, then systemic errors may not be relevant.

Additional types of biases:

Sampling

A test of every single item in a group you want to learn about is a census (e.g. monthly inventory, balance sheet). Anything short of a complete census of the population is a sample.

It seems remarkable that looking at some things tells us anything about things we aren’t looking at, but this is what most of science does, e.g. the speed of light was determined with some samples of light, not all light.

Everything we know from «experience» is just a sample. We didn’t experience everything; we experienced some things and we extrapolated from there. This is all we get – fleeting glimpses of a mostly unobserved world from which we draw conclusions about all the stuff we didn’t see. Yet people are confident in the conclusions they draw from limited samples, because experience tells them sampling works.

Random sampling = each item in the population should have same chance of being selected.

When your current uncertainty is great, even a small sample can produce a big reduction in uncertainty. This book is about things that are considered immeasurable, and in those cases, the initial uncertainty is generally great. And it is exactly in those types of problems where even a few observations can tell us a lot.

Exercise: what is your 90% CI for the weight of the average jelly bean? Write it down. Now suppose the weight of bean #1 is 1.4 grams. Then the next sample weighs 1.5 grams. Then the next three are 1.4, 1.6, and 1.1.

Shows the power of small samples and ranges.
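To see what those five beans buy you, here is a sketch of a small-sample 90% CI for the mean using the t-statistic. The critical values are standard two-sided 90% table values; the helper name is mine, and the notes do not say which method the book applies at this point.

```python
import math
import statistics

# Two-sided 90% critical values of Student's t, keyed by degrees of freedom.
T_90_BY_DF = {1: 6.314, 2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015}


def mean_ci_90(sample):
    """90% confidence interval for the population mean from a small sample."""
    n = len(sample)
    t = T_90_BY_DF[n - 1]
    mean = statistics.mean(sample)
    margin = t * statistics.stdev(sample) / math.sqrt(n)
    return mean - margin, mean + margin


beans = [1.4, 1.5, 1.4, 1.6, 1.1]  # grams, from the exercise
low, high = mean_ci_90(beans)
print(f"90% CI for the mean weight: {low:.2f} to {high:.2f} grams")
```

Five observations narrow the range to roughly 1.22 to 1.58 grams, which is likely much tighter than whatever you wrote down before weighing anything.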

Whether a finding is statistically significant is not the same thing as whether your current state of uncertainty is less than it was before. People appear to believe that statistical significance is some standard of legitimacy they should be concerned with.

Asking calibrated estimators for subjective estimates is very useful and has some advantages over traditional statistics.

«z-score»/«normal statistic» was developed to estimate CI based on a random sample of 30+. «t-statistic» allows for smaller samples. Its curve becomes similar to the z-score curve once the sample is 30+.

Both the z-scores and t-scores tell you how many standard deviations something is from the mean value. The z-distribution is the “standardized” normal distribution. The normal distribution is the familiar bell-shaped curve, symmetric, with almost all of the data falling within three standard deviations around the mean (Empirical Rule). For example, z = 1.5 tells you that the value is 1.5 standard deviations greater than the mean value. The t-distribution is also symmetric and mound shaped but its exact shape depends on the sample size. If the sample size is large (think 1,000 plus) then the z and t distributions are essentially the same. With smaller samples the t distribution has “fatter tails” and is more shallow and flat (compared to z). The t distribution is appropriate with continuous data when you do not know the population standard deviation (most real world cases).

The big payoff in information tends to be early in the information gathering process. On average, increasing the sample size will decrease the size of the interval, but with decreasing returns, e.g. once you get to 30 samples, you have to quadruple the number of samples if you want the error to go down by half again.
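The quadrupling follows from the standard error of a mean shrinking with the square root of the sample size. A tiny sketch, with a made-up standard deviation:

```python
import math


def standard_error(sample_sd, n):
    """Standard error of the mean for a sample of size n."""
    return sample_sd / math.sqrt(n)


sd = 10.0  # hypothetical sample standard deviation
for n in (30, 120, 480):
    # Each fourfold increase in n halves the standard error.
    print(n, round(standard_error(sd, n), 3))
```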

Sampling methods

Recatch method = sample a population, release, then resample the population to estimate the size of the population, e.g. to estimate the number of flaws in a building design, use two different groups of quality inspectors, then compare how many each caught and how many were caught by both teams.
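The notes do not give a formula. A common point estimate for this kind of two-sample overlap is the Lincoln-Petersen estimator, sketched here with made-up inspection counts; it assumes the two teams search independently and every flaw is equally likely to be found.

```python
def recatch_estimate(found_by_a: int, found_by_b: int, found_by_both: int) -> float:
    """Lincoln-Petersen estimate of the total population size."""
    if found_by_both == 0:
        raise ValueError("no overlap: the samples give no basis for an estimate")
    return found_by_a * found_by_b / found_by_both


# Hypothetical inspection: team A finds 40 flaws, team B finds 30, 12 are shared.
total = recatch_estimate(40, 30, 12)
distinct_found = 40 + 30 - 12
print(f"Estimated flaws: {total:.0f}, still undiscovered: {total - distinct_found:.0f}")
```

The smaller the overlap between the two teams, the larger the estimated number of flaws nobody has found yet.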

Population proportion sampling = estimate what proportion of a population has a particular characteristic

Spot sampling = variation of population proportion sampling. Taking random snapshots of people, processes, or things instead of tracking them constantly throughout a period of time, e.g. to track the share of time employees spend in a given activity, randomly sample people through the day to see what they are doing at that moment. If you find that in 12 instances out of 100 random samples, people were on a conference call, you can conclude they spend about 12% of the time on conference calls (90% CI is 8% to 18%).

Serial sampling = taking serial numbers to estimate production levels
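One way to get a range like that for a proportion, sketched under an assumption the notes do not spell out: treat the unknown share as Beta-distributed after the observations (uniform prior), and read the 5th and 95th percentiles from simulated draws.

```python
import random


def proportion_ci_90(hits, n, draws=100_000, seed=0):
    """Approximate 90% interval for a population proportion, by simulation."""
    rng = random.Random(seed)
    # Beta(hits + 1, misses + 1) is the posterior under a uniform prior.
    samples = sorted(
        rng.betavariate(hits + 1, n - hits + 1) for _ in range(draws)
    )
    return samples[int(0.05 * draws)], samples[int(0.95 * draws)]


low, high = proportion_ci_90(12, 100)
print(f"90% CI for time on conference calls: {low:.1%} to {high:.1%}")
```

With 12 hits in 100 snapshots the interval should come out in the neighborhood of the 8% to 18% quoted above.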

Also look into clustered sampling and stratified sampling.

Outliers

t-statistic and z-statistic are types of «parametric» statistics, meaning they make some assumptions about the underlying distribution.

Some «power law» distributions have no definable mean.

If we sample income levels of individuals, the power of earthquakes, or the size of asteroids, we may find that the 90% CI does not necessarily get narrower as sample size increases. Some samples will temporarily narrow the 90% CI, but some outliers are so much bigger that if they came up in the sample, they would greatly widen the CI.

Easiest way to determine how quickly estimates converge is to ask: «How big are the exceptions compared to most?»

ie. Reality-tv watching time is likely a highly skewed population with a lopsided distribution, meaning the median and mean can be different values. Estimating the median avoids the problem of nonconverging estimates.

Nonconverging data can be a challenge, especially with small samples. e.g. imagine surveying customers about how many hours per week they spend watching reality TV (their answers: 0, 0, 1, 1, and 4). A spreadsheet would show the lower bound as a negative value, which makes no sense.
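
The negative lower bound is easy to reproduce. A minimal sketch using a standard t-based 90% CI on the five answers; the critical value 2.132 is the t-table value for 4 degrees of freedom:

```python
import math
import statistics

hours = [0, 0, 1, 1, 4]   # survey answers from the example above
t_crit = 2.132            # t value for a 90% CI with 4 degrees of freedom (from a t-table)

mean = statistics.mean(hours)
std_err = statistics.stdev(hours) / math.sqrt(len(hours))
lower = mean - t_crit * std_err
upper = mean + t_crit * std_err
print(f"90% CI: {lower:.2f} to {upper:.2f}")  # lower bound comes out negative
```

The parametric method assumes a symmetric distribution around the mean, which is why it happily reports an impossible negative number of hours.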

«Power law» distributions can still be measured using «nonparametric» methods.

«mathless median» similar to the «rule of 5»:
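
For reference, the «rule of 5» says there is a 93.75% chance that the population median lies between the smallest and largest values in a random sample of five. A small sketch of where that number comes from, assuming each sample independently has a 50% chance of falling above the median:

```python
def chance_median_within_range(n: int) -> float:
    """Chance the population median lies between the min and max of n random samples.

    The median is missed only if all n samples fall above it or all fall below it.
    """
    return 1 - 2 * 0.5 ** n

print(chance_median_within_range(5))  # 0.9375
```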

Threshold probability

Uncertainty about the threshold can fall much faster than the uncertainty about the quantity in general.

How to estimate the chance that the median of a population is on one particular side of a threshold:

If we change a feature on a product and want to determine how much this affects customer satisfaction, we might need an experiment using a test group, control group, and baseline.

Productize «experiments» similar to A/B tests in Optimizely? Developer satisfaction based on referring a friend? Like NPS?

Statistical significance

Statistical significance is not «what is the probability that the drug works?». It is «given that what we observed was a fluke, what is the chance we would observe this difference or an even bigger difference?»

If the p-value is less than some previously stated threshold (e.g. 0.01), then researchers reject the null hypothesis.
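
To make the definition concrete, here is a sketch of a one-sided exact binomial test. The scenario (18 of 20 patients improving when only half would be expected to improve by chance) and the 0.01 threshold are hypothetical:

```python
from math import comb

def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """P(observing >= successes in trials) if the null hypothesis is true."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

p = binomial_p_value(18, 20)
print(p, p < 0.01)  # a small p-value: the result would be rare if it were a fluke
```

Note that the p-value is still a statement about the data given the null hypothesis, not the probability that the drug works.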

Significance testing is an artifact of the previously-mentioned frequentist view of probability. This definition of probability–an idealized frequency limit of a purely random, strictly repeatable process over infinite trials–is virtually impossible to apply to real-world problems like the probability of the success of a new product or effectiveness of a new drug.

Terms related to statistical significance:

WTP = «willingness to pay» = method of valuing things by seeing how much people are willing to pay.

Quantifying performance

I know what to look for, but how do I total all these things?

Terms like «performance» and «quality» are used ambiguously. Generally, clients can provide a list of separate observations they associate with performance (e.g. «gets things done on time», «gets positive accolades from clients», «error-free modules completed per month») but don’t know how to consolidate them into a single measurement.

This is a problem of how to tally lots of different observations in a total «index» of some kind.

One method is a utility (or «indifference») curve which equalizes two measurements:

Another method is to collapse different considerations into a «certain monetary equivalent (CME)». CME of an investment is the fixed and certain dollar amount that someone considers just as good as the investment.

Could be used to combine multiple parameters describing quality into one monetary quality value.

What does performance mean if not a quantifiable contribution to the ultimate goals of the organization? Try to reduce factors down to the ultimate goal, ie. profit or shareholder value maximization problem. Examples of how people defined some form of «performance» as a quantifiable contribution to some ultimate goal:

The ultimate measurement instrument: human judges

The human mind has remarkable advantages over mechanical measurements for assessing complex and ambiguous situations. Tasks such as recognizing a face or voice pose great challenges for software but are trivial for a five-year-old.

We need to exploit the strengths of the human mind as a measurement instrument while adjusting for its errors.

Linear models

Weighted scores are one way to estimate relative items like «business opportunities».

Simple linear models outperform human experts in many cases.

A normalized z-score outperforms linear weighted scoring methods because it solves inadvertent weighting, e.g. if one criterion is always a 4 out of 10, then another criterion that varies more will be weighted more highly.
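
A minimal sketch of the normalization step. The two criteria, their ratings, and the equal weighting are all hypothetical:

```python
import statistics

def z_scores(values: list[float]) -> list[float]:
    """Rescale values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    return [(v - mean) / spread for v in values]

# Hypothetical 1-10 ratings for four business opportunities on two criteria.
market_size = [4, 5, 4, 5]      # barely varies
strategic_fit = [1, 9, 3, 7]    # varies a lot

# Equal weights on the normalized scores give each criterion equal influence,
# whereas adding the raw ratings would let strategic_fit dominate.
combined = [a + b for a, b in zip(z_scores(market_size), z_scores(strategic_fit))]
print([round(score, 2) for score in combined])
```

A criterion with no variation at all has a standard deviation of zero and cannot be normalized this way; it would need to be dropped or handled separately.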

The «invariant comparison» principle says that if one measurement instrument says A is more than B, then another measurement instrument should give the same answer. This can fail in practice, e.g. someone may do better on one IQ test than on another.

Another version of invariant comparison arises when there are too many individuals for each judge to evaluate, so they are divided up among the judges. e.g. if you wanted to evaluate the proficiency of project managers based on their performance on various projects, variance in the grading style of each judge and in the difficulty of each project would mean the comparison of PMs would not be invariant of who judged them or the projects they were judged on.

Rasch model = predicted chance that a subject would correctly answer a true/false question based on (1) the percentage of other subjects in population who answered it correctly, (2) the percentage of other questions that the subject answered correctly. This can be used in the PM scenario to remove variance due to judges and project difficulty.
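
In its standard form, the Rasch model expresses that chance as a logistic function of the difference between the subject's ability and the question's difficulty, both on a log-odds scale. A minimal sketch; the parameter names and values are illustrative, and estimating them from real response data is a separate step not shown here:

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """Chance of a correct answer under the Rasch model (log-odds scale)."""
    return 1 / (1 + math.exp(-(ability - difficulty)))

print(rasch_probability(0.0, 0.0))            # 0.5 when ability equals difficulty
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
```

In the PM scenario, «ability» would correspond to the project manager's proficiency, and «difficulty» would absorb the harshness of the judge and the difficulty of the project.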

Lens model: uses implicit weights derived from decision makers to create a formula. Experts know what to look for but can apply it with great inconsistency.

Could this be productized for engineering performance reviews?

A disturbing trend in management is to develop a type of weighted score where the score and weight are both subjective scales with arbitrary point values, ie. for rating a proposed project via 1 to 5 rating in categories such as «strategic alignment» and «organizational risk». These introduce additional errors for six reasons:

When experts select weights on a scale of 1 to 5, it’s not necessarily clear that they interpret a 4 to mean twice as important as a 2.

Prediction markets

How to Measure Anything by Douglas W. Hubbard

Some Quotes from the Author with my Notes, Thoughts, and the Occasional Opinion

Book Title	Author	Purchased	Digital or Physical	Book Resources
How to Measure Anything	Douglas W. Hubbard	Yes	Physical	Link

When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of science.

Douglas W. Hubbard

Anything can be measured. If something can be observed in any way at all, it lends itself to some type of measurement method. No matter how «fuzzy» the measurement is, it’s still a measurement if it tells you more than you knew before.

Douglas W. Hubbard

Like many hard problems in business or life in general, seemingly impossible measurements start with asking the right questions. Then, even once questions are framed the right way, managers and analysts may need a practical way to use tools to solve problems that might be perceived as complex.

Douglas W. Hubbard

It is simply a habit to default to labeling something as intangible.

Douglas W. Hubbard

Why do we care about measurements at all?

Douglas W. Hubbard

Unless someone is planning on selling the information or using it for their own entertainment, they shouldn’t care about measuring something if it doesn’t inform a significant bet of some kind. So don’t confuse the proposition that anything can be measured with everything should be measured.

Applied Information Economics: A Universal Approach to Measurement

Success is a function of persistence and doggedness and the willingness to work hard for twenty-two minutes to make sense of something that most people would give up on after thirty seconds.

Douglas W. Hubbard

The lesson for businesses is to avoid the quagmire that uncertainty is impenetrable and beyond analysis. Instead of being overwhelmed by the apparent uncertainty in such a problem, start to ask what things about it do you know.

Douglas W. Hubbard

Executives often say, «We can’t even begin to guess at something like that.» They dwell ad infinitum on the overwhelming uncertainties. Instead of making any attempt at measurement, they sometimes prefer to be stunned into inactivity by the apparent difficulty in dealing with these uncertainties. Yes, there are a lot of things you don’t know, but what do you know?

Douglas W. Hubbard

Usually things that seem immeasurable in business reveal themselves to much simpler methods of observation, once we learn to see through the illusion of immeasurability.

The three reasons why people think that something can’t be measured:

In addition to these reasons why something can’t be measured, there are also three common reasons why something shouldn’t be measured.

The Concept of Measurement

Douglas W. Hubbard

If we incorrectly think that measurement means meeting some nearly unachievable standard of certainty, then few things will be measurable even in the physical sciences.

Definition of Measurement

Measurement: A quantitatively expressed reduction of uncertainty based on one or more observations.

Douglas W. Hubbard

The receiver of information could be described as having some prior state of uncertainty. That is, the receiver already knew something, and the new information merely removed some, not necessarily all of the receiver’s uncertainty.

This «uncertainty reduction» point of view is what is critical to business. Major decisions made under a state of uncertainty, such as whether to approve large information technology (IT) projects or new product development, can be made better, even if just slightly, by reducing uncertainty. Such an uncertainty reduction can be worth millions.

A Variety of Measurement Scales

Douglas W. Hubbard

So, a measurement doesn’t have to eliminate uncertainty after all. A mere reduction in uncertainty counts as a measurement and can potentially be worth much more than the cost of the measurement.

The Object of Measurement

A problem well stated is a problem half solved.

Douglas W. Hubbard

Once managers figure out what they mean and why it matters, the issue in question starts to look a lot more measurable.

The clarification chain is just a short series of connections that should bring us from thinking of something as an intangible to thinking of it as tangible.

Questions to ask in order to get to the Clarification Chain:

Douglas W. Hubbard

Business managers need to realize that some things seem intangible only because they haven’t defined what they are talking about. Figure out what you mean and you are halfway to measuring it.

Sources:

How to measure anything

Douglas Hubbard's book «How to Measure Anything»: part 1

Chapter 1. The intangible and the problem of measuring it.

Chapter 2. An intuitive ability to measure anything: Eratosthenes, Enrico, and Emily

Chapter 3. Why the immeasurability of the intangible is only an illusion

Chapter 4. Framing the measurement problem

Chapter 5. Calibrated estimates: what do you already know?

Chapter 6. Assessing risk: an introduction to Monte Carlo simulation

Chapter 7. Assessing the value of information

How to Measure Anything. Finding the Value of Intangibles in Business

About the book «How to Measure Anything. Finding the Value of Intangibles in Business»

Book Notes by Abi Noda

Illusion of intangibles

Defining what you want to measure

How to measure