Which measurement would a researcher use to test for reliability when the data are in dichotomous yes no format?

INTRODUCTION

The Ministry of Health Malaysia has been striving to improve patient-oriented pharmaceutical care services in order to promote health, upgrade the dispensing system and increase patients’ satisfaction with the services. For chronic diseases, prescriptions with multiple drugs are usually ordered for more than a month’s supply. Hence, a partial drug supply is given in these cases, in line with the Quality Use of Medicines practice in public healthcare facilities.1 Subsequent supplies are provided in the following months, either at the conventional pharmacy counter or through the Pharmacy Appointment System, until the prescription has been supplied in full. The Pharmacy Appointment System is an alternative to the conventional dispensing system in which patients can use channels such as phone, short messages, e-mail and fax to notify the pharmacy of the appointment date for their follow-up medicine supplies.

Pharmacy value-added services in the Malaysian context can be defined as any pharmacy activities or practices introduced or initiated by pharmacists (or pharmacy staff) through innovation and creativity to improve the delivery of pharmaceutical care to patients. The objectives of the pharmacy value-added services in the Malaysian public health sector are to facilitate refill of medications, to reduce waiting time and to increase patient convenience.2 The most common pharmacy value-added services in Malaysia are the integrated drug dispensing system, drive-through pharmacy, medicines by post 1Malaysia (UMP 1Malaysia), SMS and take, Email and take, Telephone and take, Fax and take, and appointment card.

Pharmacy Value-Added Services in Malaysia

One of the most recommended pharmacy value-added services is the pharmacy drive-through service. This service provides monthly partial supplies to patients via the drive-through pharmacy counter. Patients are given an appointment date prior to drug collection. On the scheduled date, patients collect their partial medicine supplies at the drive-through pharmacy counter only. Patients who wish to switch to another pharmacy value-added service in the near future may do so without any processing charges, but they must inform the pharmacist in advance to allow time for their personal files to be transferred from one service center to another.

Besides the drive-through pharmacy, medicine by post 1Malaysia, known as UMP 1Malaysia, was initiated to deliver monthly medications to a patient’s designated address via postal service, with a nominal postal fee charged to the patient by Pos Malaysia. The main objectives of UMP 1Malaysia are to eliminate patients’ waiting time in the clinic, promote continuity of medication and increase drug compliance. This service is free of any administrative or pharmaceutical service charges.

The integrated drug dispensing system is another dispensing system whereby patients are allowed to refill their partial supplies at any government health facility listed in the Ministry of Health Malaysia Integrated Drug Dispensing System Directory throughout Malaysia. This system enables patients to collect their subsequent medication supply at the facility most conveniently located to their home, office or village.

SMS and take is another service provided for patients who want to collect follow-up medication in the government healthcare system. This service requires patients to send a short message (SMS) with their details and the desired date and time of collection to the pharmacist in charge. Upon arrival at the counter, patients only need to show their prescription numbers and collect the supply without having to queue or pay a service fee. This initiative was pioneered to reduce waiting time, increase delivery performance and improve patient satisfaction.

Generally, patients who are clinically stable, on long-term medications, well-counselled and not taking psychotropic drugs are suitable to use pharmacy value-added services. The key point for the Malaysian context is that pharmacy value-added services are almost solely provided by the government through the Ministry of Health Malaysia. Patients who wish to visit medical officers in government healthcare facilities must register themselves at the registration counter. The registration fee is waived for all government servants and retired citizens. However, patients who are self-employed or work in the private sector are charged a registration fee of MYR 3 to visit a medical officer and MYR 5 to visit a specialist or a consultant. Only a registered patient can use the pharmacy services. Medicine supplies from the government pharmacies are free to patients. Patients who use pharmacy value-added services do not have to pay any pharmaceutical service charges.

Extended & Improved Pharmacy Services

Value-added pharmaceutical products and services have been introduced worldwide through innovation in many different forms. Several countries offer various extended or improved pharmacy services as a result of the changing roles of, and challenges faced by, pharmacists. Taipei Medical University-affiliated Shuang Ho Hospital implemented the first drive-through pharmacy service in Taiwan in July 2011, opening a four-lane drive-through near the hospital building.3 The implementation of the Shuang Ho Hospital drive-through service increased the overall prescription refilling rate, online reservation usage and the proportion of medications picked up within a six-month period. In Australia, one-stop-shop, forward dispensing, “Rolls Royce service”, e-prescribing, chronic illness card, prescription reminder systems, drive-through, pick-up and home delivery services are a few examples of the innovations that consumers expect from their community pharmacies.4 E-prescribing, which supplies prescriptions via an online system, is in high demand among Australian patients because it is seen as reducing pharmacy queues and prescription-related paperwork.4 At the same time, more and more consumers in Australia prefer home delivery and drive-through services over face-to-face, in-pharmacy medicine pick-up.5

In the United States, outpatient drugs are dispensed through both community and mail order pharmacies.6 Mail order pharmacy services historically existed primarily for delivery of medication to rural or remote areas.7 Through expanded services, pharmacy benefit managers routinely offer drug formulary development, specialty pharmaceutical distribution and mail order prescription delivery options to clients to control prescription drug costs.8 Contrary to the widespread perception that prices are lower via mail order than at community pharmacies, several studies have reported mixed or opposite findings regarding the potential cost savings to both patients and insurers.6,8 Indeed, some generic drug prices are higher through mail order pharmacy. It has also been found that the loss of copayments under the mail order service benefit was greater than the savings on ingredient costs and dispensing fees, even though mail order pharmacy is considered less expensive for patients overall.9

While cost containment is still an ongoing debate, there is evidence that mail order pharmacy services improve patients’ adherence to medications and control of disease state. In a recent study in the United States, patients who received medications by mail had better adherence to antiglycemic10, antihypertensive and lipid-lowering medications and better LDL-cholesterol control11 when compared with patients who obtained refills at community pharmacies.12 It was also found that patients who switched to mail order pharmacy had higher medication possession ratios and trended toward lower total and diabetes-related medical costs over time.13 Higher patient satisfaction was observed among customers who used mail order pharmacy14 compared with traditional pharmacy, with the highest satisfaction related to the phone service, technical competence and turnaround time dimensions.15

Scenario in Malaysia

Pharmacy value-added services are still a new concept to many Malaysians who have not been exposed to them. In fact, not many patients adopt these services even though they are free of service charges, apart from the postal fee for home delivery. It is important to note that, at this stage, not all government pharmacies provide all types of pharmacy value-added services, as the provision of these services depends on the respective facilities. However, most public pharmacies offer at least two to three types of pharmacy value-added services on a daily basis for local consumers. Published national guidelines or handbooks covering the implementation, protocols and detailed descriptions of the various pharmacy value-added services are still lacking, and there is no documented research on the various aspects of pharmacy value-added services.

To date, studies that explore the public’s knowledge and perspectives towards pharmacy value-added services in Malaysia are scarce. To the best of the authors’ knowledge, no published studies have examined patients’ intention to adopt pharmacy value-added services using an extended Theory of Planned Behavior (TPB) as the theoretical model. This article is part of a major study to investigate possible predictors that affect the public’s intention to adopt pharmacy value-added services.

The purpose of this article is to develop, translate and validate a questionnaire instrument within the Malaysian context using emerging salient themes generated from face-to-face interviews in an earlier exploratory study. This paper discusses the development, reliability and validity of the Pharmacy Value-Added Services Questionnaire (PVASQ). A brief introduction to the TPB follows.

Theory of Planned Behavior

The TPB is an extension of the Theory of Reasoned Action16,17 and is underpinned by the assumption that human behavior is essentially rational and that the immediate antecedent of any behavior is intention. Intention (INT) refers to the motivational factors that influence a given behavior; as a general rule, the stronger the intention to engage in a behavior, the more likely it is to be performed.18 In the original TPB model, the dependent variable, intention (INT), is predicted by three conceptually independent variables: attitudes (ATT), subjective norms (SN) and perceived behavioral control (PBC). ATT refers to the degree to which individuals have a favorable or unfavorable evaluation of the behavior. SN refers to the social pressure individuals perceive with regard to whether or not they are expected to behave in a particular way. PBC refers to the perception of internal and external resource constraints on performing the behavior. The TPB causal chain implies that altering behavioral beliefs might change the level of intention to perform a behavior; in this study, the intention to adopt pharmacy value-added services is the main outcome of interest. Given the efficacy of the model, this study aimed to operationalize the TPB constructs to predict the intention to use pharmacy value-added services among patients who collect monthly partial medication supplies.

According to Ajzen18,19,20, the TPB is open to the inclusion of additional predictors if they can be shown to capture a significant proportion of the variance in intention or behavior after the theory’s current variables have been taken into account.18 In our earlier qualitative study, we noticed that a majority of patients did not know about the pharmacy value-added services offered in public healthcare facilities. At the same time, patients expressed a higher intention to use pharmacy value-added services if they learned about them through pharmacy counters or advertisements. Patients also reported that they had high expectations regarding the improvement of pharmacy value-added services before they would use these services in the future. Hence, we postulate that patients’ knowledge about the existence and the benefits of pharmacy value-added services might influence their intention to use the services. We also postulate that patients’ expectations will affect the intention to use pharmacy value-added services. Therefore, we included these two variables (knowledge and expectations) alongside the original three variables (ATT, SN, PBC) in our study to improve the prediction of intention. Figure 1 depicts the conceptual model in our study, which borrows its framework from the TPB by Ajzen.18

Figure 1

Conceptual model of VAS study.

Adaptation of the Theory of Planned Behavior model. The first three independent variables (Attitudes*, Subjective norms* and Perceived Behavioral Control*) are the original predictors in the TPB model; Intention* serves as the dependent variable in the model. Additional predictors (Knowledge and Expectations) were incorporated into the model for model testing purposes.

The TPB provides a solid framework for exploring the patient’s intention in many health related behaviors.18 One of the major strengths of this theory is that since its introduction 26 years ago19, it has become one of the most frequently cited and influential models for prediction of human social behavior.20,21 This parsimonious model has been applied to a wide range of behaviors including internet purchasing behavior22, condom use23, exercise intention24, intention to leave school early25 and digital piracy behavior.26

METHODS

Stage 1: Qualitative exploration; Interviews

As proposed by Ajzen and Fishbein16, a critical step in any TPB study involves conducting elicitation interviews with people from similar backgrounds and characteristics to those in the sample. Therefore, this questionnaire instrument was developed using salient beliefs generated from face-to-face interviews. Interviewees were patients who collected partial supply from public healthcare facilities in the state of Negeri Sembilan, Malaysia and were recruited using snowball sampling. Purposive approaches were initially applied to identify three participants with diverse social backgrounds. The first three participants were a Malay, a Chinese and an Indian patient with different educational and income levels. These three participants were then asked to locate other potential participants from their own networks who met the selection criteria. The inclusion criteria for the interview were: being at least 18 years old, collecting medicine supply from a government pharmacy, and being Malaysian. Recruitment continued until the saturation point had been reached, whereby no new and relevant ideas or themes emerged from the interviews. Data saturation was observed at the 11th interview, but interviewing continued to the 12th participant to confirm saturation. Prior to the interviews, the principal investigator, who was the interviewer, received training and coaching from experts from the Discipline of Social and Administrative Pharmacy, Universiti Sains Malaysia. The interviews were audiotaped and transcribed verbatim for thematic content analysis. Important themes were extracted from the interviews and later served as variables to help operationalize the TPB constructs. The interviews were also used to identify new variables that may influence patients’ intention to use pharmacy value-added services to collect partial medicine supply.

A semi-structured interview guide was used for all interviewees, with probing questions asked during the conversation to clarify the meaning of responses and gain insight into the topic discussed. Four types of questions were asked in the interview27: 1) Positive and negative feelings about using pharmacy value-added services; 2) Positive and negative attributes or outcomes of using pharmacy value-added services; 3) Influential individuals or groups of people who are in favor of or opposed to their use of pharmacy value-added services; 4) Situational or environmental facilitators and barriers (obstacles) that make it easy or difficult to use pharmacy value-added services. Apart from these TPB-related questions, other questions were casually asked and discussed to gain a better understanding of topics and areas related to pharmacy services. However, detailed descriptions of these conversations are beyond the scope of this article.

Most participants were comfortable conversing in English, with only two participants requesting to use the Malay language in the middle of the interview. Therefore, some vernacular words were translated during transcription of the interviews. All interviewees were briefed at the beginning of the interview and debriefed after the session ended. The purpose of the interview session, the informed consent form, the use of a voice recorder during the interview and the nature of the conversation were explained to interviewees. All participants were given time and opportunities to ask questions before the interviews started. At the end of the interview, all participants were asked if they had any additional information or experience to share that had not been covered earlier. Written voluntary informed consent to participation was obtained, and confidentiality and anonymity were guaranteed. Monetary rewards were not part of the agreement; however, given local cultural expectations and practice, a small token of appreciation (a box of cookies) was given to each participant on completion of the interview. All studies were registered with the National Medical Research Register and approved by the Medical Research & Ethics Committee, Ministry of Health Malaysia. The registration number of the protocol is NMRR-14-483-20556.

Stage 2: Questionnaire Constructs Development

The questionnaire was developed in English using salient findings generated from the qualitative interviews. We constructed the TPB research tool based on the TACT principles28, under which the target behavior is defined carefully in terms of its target, action, context and time (TACT). The variables in the model reflect psychological constructs and have special meaning within the theory. Although there is not a perfect relationship between behavioral intention and actual behavior, intention can be used as a proximal measure of behavior.28 The variables in the TPB model may be measured directly or indirectly. Only direct measures were used in this study, because items that utilize indirect measures were found to be cognitively burdensome to many respondents.

Initially, the Pharmacy Value-Added Services Questionnaire (PVASQ) was developed in English. It served as the first draft of the questionnaire instrument containing 36 questions and was divided into 3 domains and 1 demographic section as detailed below:

  • A. Knowledge Scale: 7 items, dichotomous response options (Yes/No).

  • B. Perspective Scale: 15 items, 7-options unipolar Likert response format from strongly disagree (1) to strongly agree (7). It should be noted here that this scale encompasses TPB constructs (ATT, SN, PBC and INT).

  • C. Expectations Scale: 7 items, 7-options unipolar Likert response format from strongly disagree (1) to strongly agree (7).

  • D. Demographic: 7 items.

Stage 3: Translation and back-translation of Pharmacy Value-Added Services Questionnaire (PVASQ)

We translated the English survey instrument into the Malay language as this is our respondents’ native language. The aim of the translation was to increase the response rate by facilitating respondents’ comprehension and reducing cognitive burden. The translation process involved forward and backward translation. Back-translation was used to confirm meaning equivalence of the original text with the translated version. First, the principal investigator of this research study, who is knowledgeable and fluent in both English and Malay, forward-translated the instrument from English to Malay. In the second step, another independent researcher, who was blinded to the original English version of the questionnaire, back-translated all questions from Malay to English. One faculty expert and one senior pharmacist researcher evaluated the meaning equivalence between the two versions. Each item was then reviewed by a group of experts consisting of healthcare providers, who assessed it for appropriateness and readability. All experts agreed that both versions were acceptable, easily understood and meaningfully translated. Only minor spelling corrections were made. The Malay questionnaire consists of 3 domains and 1 demographic section, organized in the following sequence: A) Knowledge, B) Perspective, C) Expectations and D) Demographic. There were a total of 36 questions in the Malay language questionnaire, of which 29 were items testing hypothesis variables and the remaining 7 were demographic questions. The questionnaire instrument in English is included in the

.

Stage 4: Pre-test of instruments

The Malay PVASQ with a cover page was pre-tested with N=15 respondents recruited by convenience sampling from the Seremban Health Clinic. The respondents were asked to self-administer the questionnaire in a pen-and-paper manner. Respondents were briefed about the purpose of the pretest, the type of questions they would be asked and the length of the questionnaire. They were told to read the questions on their own without discussing them with the person sitting next to them. They were asked to highlight any questions that they found ambiguous or difficult to comprehend. Respondents were also asked to provide comments and feedback on the relevance, clarity and length of the questionnaire. Respondents were given ample time to complete the questionnaire. Every respondent gave verbal consent to participate in the pretest. Face validity was checked, minor adjustments were made to further improve clarity and all items were retained.

Stage 5: Test-retest and Field Data Collection

The final version of the Malay PVASQ was tested on a group of twenty-five respondents who fulfilled the inclusion criteria and were drawn by convenience sampling from two public healthcare facilities (Tuanku Ja’afar Seremban Hospital and Seremban Health Clinic). The inclusion criteria for the questionnaire survey, from pretest to field data collection, remained the same as those for the earlier interviews, with the additional condition that participants must be able to self-administer the questionnaire. Participants who did not bring their spectacles and could not read without them were excluded. This intra-rater reliability test required the respondents to self-administer the same questionnaire instrument at two time points with an interval of one week between the two tests. Participants were briefed about the purpose of the tests and were asked to complete a pen-and-paper Malay version of the questionnaire. All participants gave their verbal consent before completing the questionnaire. No monetary rewards were promised in the test-retest; however, a small token of appreciation (a pill box) was given to every participant at the end of the tests. All participants (N=25) completed all questions at both time points. Data were collected from the end of October 2014 to early November 2014.

For field data collection, 460 questionnaires in Malay and English were distributed from mid-November 2014 to the end of December 2014. The sample size was calculated using the sample size table from Krejcie & Morgan.29 Due to the lack of a central health database in Negeri Sembilan, the number of patients who collect partial medicine supplies in this state is not available. However, the population aged over 15 years in the state of Negeri Sembilan in 2013, obtained from the Department of Statistics Malaysia, was reported as 806,200. Using N=806,200 as the population size, the sample size required for the quantitative study was 384 subjects. After allowing for a 20% non-response rate, a minimum sample size of 460 questionnaires was used. Pen-and-paper questionnaires were self-administered by patients who visited these five government healthcare facilities: Tuanku Ja’afar Seremban Hospital, Seremban Health Clinic, Ampangan Health Clinic, Senawang Health Clinic and Seremban 2 Health Clinic. Consent was considered given when patients did not reject the survey invitation. Participants who agreed to take part completed the questionnaire on the spot and were told to return it to the study investigator at the pharmacy counter or place it in the collection box.
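The sample size arithmetic above can be reproduced with the formula that underlies the Krejcie & Morgan table. The sketch below is illustrative only: the function name is hypothetical, and the parameter values (chi-square of 3.841 for 95% confidence, P=0.5 and a 5% margin of error) are the defaults of the published formula, assumed here rather than stated in the text.

```python
import math

def krejcie_morgan(population, chi_sq=3.841, p=0.5, margin=0.05):
    """Required sample size for a finite population (Krejcie & Morgan formula).

    chi_sq, p and margin are the formula's usual defaults; they are an
    assumption here, since the text only says the published table was used.
    """
    numerator = chi_sq * population * p * (1 - p)
    denominator = margin ** 2 * (population - 1) + chi_sq * p * (1 - p)
    return math.ceil(numerator / denominator)

# Population aged over 15 in Negeri Sembilan in 2013, as quoted in the text.
required = krejcie_morgan(806_200)

# Inflate by 20% for anticipated non-response.
inflated = required * 1.20

print(required, inflated)
```

Under these assumptions the formula gives 384 subjects, and inflating by 20% gives roughly 460.8, in line with the 460 questionnaires reported.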

Stage 6: Statistical Analysis: Internal Consistency, Test-retest reliability and construct validity

Data entry and analysis were performed using SPSS version 22. The significance level was set at p<0.05. Demographic data were presented as numbers and percentages. Internal consistency was assessed using Cronbach’s alpha. The stability of the construct measures was assessed by test-retest reliability.30 The chance-corrected agreement for the twenty-five respondents at two time points was calculated using Cohen’s unweighted kappa statistic for nominal-scale data, and the Intraclass Correlation Coefficient (ICC) was applied to interval-scale data. The ICC model used was the one-way random effects model, single measures.31

Confirmatory factor analysis (CFA) was performed to assess the construct validity of the PVASQ. CFA examined the construct validity using principal axis factoring and Varimax rotation with Kaiser normalization. The TPB postulates three conceptually independent determinants of intention.18 Therefore, the decision to choose Varimax rotation is well supported.32 Instead of relying on eigenvalues to provide the domains for our constructs, we restricted the extracted factors to a fixed number.33 In this case, we restricted the extracted factors to four, based on the three TPB constructs of attitudes, subjective norms and perceived behavioral control and the fourth construct of expectations. This is consistent with our TPB model and with the findings of our initial qualitative investigation, which revealed the role of expectations. Items in the intention scale were not included in the factor analysis because intention is a dependent variable in the TPB model. The knowledge scale consists of dichotomous items, which are unsuitable for factor analysis because binary variables cannot be expressed within factor models.34 Thus, the test-retest Intraclass Correlation Coefficient was considered sufficient to indicate construct validity for the knowledge scale.

RESULTS

Reliability and Internal Consistency

A total of 25 respondents participated in the test-retest study, and there were more women (80%) than men (20%) in the sample. A majority of the respondents were Chinese, and respondents were aged 41-50 (36%), 51-60 (36%) or older than 60 (24%). About 40% of respondents completed primary school, 32% had secondary schooling, and 12% completed a high school education at the equivalent of A-Levels. Sixty percent of the participants were housewives and 28% were retired citizens. Twenty respondents (80%) reported not having any income, while 16% had a monthly salary between MYR 1 and MYR 2000. Respondents who received pocket money from their spouses were included in the “no income or not relevant” category. Respondents’ demographic characteristics are shown in Table 1.

Table 1

Respondents’ characteristics and descriptive statistics (N=25)

The reliability of the Malay language PVASQ was established by testing both consistency and stability. The acceptable Cronbach’s alpha coefficient was set a priori at 0.70 in this study. The alpha values for the pooled 29 items at Test and Retest were alpha=0.912 and alpha=0.908 respectively, both exceeding the pre-set value. All TPB constructs had alpha values above 0.70 at both Test and Retest, with the attitudes construct showing alpha=0.939 at Test and alpha=0.945 at Retest. Cronbach34 suggested that if several factors exist in the questionnaire, the formula should be applied separately to items relating to different factors. All α values are shown in Table 2.

Table 2

Cronbach’s Alpha values for all scales and TPB constructs.
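For readers who want to see how an alpha value of this kind is obtained, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the response data are hypothetical and are not taken from the study, which used SPSS. Applied to items scored 0/1, such as the Yes/No knowledge items, the same formula corresponds to the Kuder-Richardson 20 coefficient.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 7-point Likert responses: 5 respondents x 3 items.
demo = [
    [6, 7, 6],
    [5, 5, 6],
    [7, 7, 7],
    [4, 5, 4],
    [6, 6, 5],
]
alpha = cronbach_alpha(demo)

# Compare against the a priori threshold of 0.70 used in the text.
print(round(alpha, 3), alpha >= 0.70)
```

Following the recommendation cited in the text, the function would be called once per construct (for example once for the ATT items and once for the SN items) rather than only on the pooled items.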

In this study, individual items in the knowledge scale were measured using dichotomous response options of “Yes” and “No”, with a score of one credited for a “Yes” answer and zero for a “No” answer. The total score for the knowledge scale is the sum of all seven individual items in this scale and ranges from zero to a maximum of seven. This knowledge composite score is treated as interval data.35 Both the expectation and perspective scales used 7-point Likert items ranging from 1 (strongly disagree) to 7 (strongly agree). The perspective score was the sum of all items in this scale and ranged from 67 (minimum) to 105 (maximum) points. The expectation score was calculated in the same manner and ranged from a minimum of 40 to a maximum of 49. Both perspective and expectation scores were treated as interval-scale data, as suggested by Brown.35,36 Accordingly, the ICC was used to analyse the reliability of the knowledge, perspective and expectation scales. Since the knowledge items were constructed with binary responses, which are nominal data, unweighted kappa was used in addition to the ICC to demonstrate the test-retest reliability of individual items. Table 3 depicts the kappa coefficients of the 7 items in the knowledge scale. The kappa coefficients in the knowledge scale range from 0.503 to 0.905, indicating a moderate to almost perfect strength of agreement between test and retest.

Table 3

Unweighted Kappa Coefficient for all items with binary responses within Knowledge scale (N=25).
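The item-level agreement described above can be illustrated with a short sketch of Cohen's unweighted kappa, kappa = (observed agreement − chance agreement) / (1 − chance agreement), computed for one Yes/No item answered at test and retest. The function name and the answers are hypothetical; the study itself obtained these values from SPSS.

```python
from collections import Counter

def cohen_kappa(test, retest):
    """Cohen's unweighted kappa for two sets of nominal answers from the same respondents."""
    n = len(test)
    # Proportion of respondents giving the same answer at both time points.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance from the marginal frequencies.
    test_freq, retest_freq = Counter(test), Counter(retest)
    categories = set(test) | set(retest)
    expected = sum(test_freq[c] * retest_freq[c] for c in categories) / n ** 2
    if expected == 1:
        return float("nan")   # kappa is undefined when everyone gives one answer
    return (observed - expected) / (1 - expected)

# Hypothetical answers to one knowledge item (1 = Yes, 0 = No), 10 respondents.
test_answers   = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
retest_answers = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
kappa = cohen_kappa(test_answers, retest_answers)
print(round(kappa, 3))
```

In this made-up example eight of ten answers agree, but the kappa is noticeably lower than 0.8 because half of that agreement would be expected by chance alone.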

The ICC coefficients calculated using model 1 (one-way random effects), single measures, are shown in Table 4. The ICC for all scales tested for intra-rater (test-retest) reliability was good: ICC[1,1] (knowledge)=0.872 [95%CI 0.733:0.941]; ICC[1,1] (perspective)=0.990 [95%CI 0.978:0.996]; ICC[1,1] (expectation)=0.967 [95%CI 0.927:0.985]; ICC[1,1] (ATT)=0.998 [95%CI 0.996:0.999]; ICC[1,1] (SN)=0.979 [95%CI 0.954:0.991]; ICC[1,1] (PBC)=0.943 [95%CI 0.876:0.974]; ICC[1,1] (INT)=0.985 [95%CI 0.967:0.993]. Hence, there is evidence for the repeatability of the construct measurements between the two time points.

Table 4

Reliability of Test-retest using Intraclass Correlation Coefficient.
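As an illustration of the one-way random effects, single measures model named above, the sketch below computes ICC[1,1] = (MSB − MSW) / (MSB + (k − 1) × MSW), where MSB and MSW are the between-respondent and within-respondent mean squares and k is the number of time points. The function name and the scores are hypothetical, and confidence intervals are not computed here.

```python
def icc_oneway_single(scores):
    """ICC[1,1] for a list of respondents, each a list of k repeated scores."""
    n = len(scores)          # number of respondents
    k = len(scores[0])       # number of time points (2 for test-retest)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-respondent and within-respondent mean squares (one-way ANOVA).
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical knowledge totals (0 to 7) at test and retest for 5 respondents.
knowledge = [[5, 5], [3, 4], [7, 7], [2, 2], [6, 5]]
icc = icc_oneway_single(knowledge)
print(round(icc, 3))
```

The one-way model treats the two time points as interchangeable, so any systematic shift between test and retest counts as within-respondent error and lowers the coefficient.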

Construct Validity: Factor Analysis (N=410)

Of the 460 questionnaires distributed, 410 useable questionnaires with no missing values were returned, giving a response rate of 89.1%. The Kaiser-Meyer-Olkin measure of sampling adequacy for the factor analysis was 0.940. Bartlett’s Test of Sphericity reported a significant Chi-square value of 5454.723 with p<0.05. Therefore, CFA was considered highly suitable for this dataset and likely to yield distinct and reliable latent factors. Factor analysis extracted four factors with a cumulative explained variance of 71%. Rotation of items (Table 5) showed strong loadings (>0.60) of seven items on one factor (Expectations) and significant loading levels (>0.40) on the second factor (Attitudes), third factor (Perceived Behavioral Control) and fourth factor (Subjective Norms). All but one of the 19 items grouped distinctly into the four expected factors; the exception was an Attitudes item that loaded on the Perceived Behavioral Control factor. This item, “P8 Home delivery reduces transportation cost”, had a low extracted communality of 0.271 and loaded on an unintended factor; it was therefore removed from the final analysis, which is not discussed in this paper. All other items had extracted communalities above 0.30, with 15 out of 19 items having values >0.60; therefore these items were retained.
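To show where a Bartlett chi-square of this kind comes from, the sketch below applies the usual formula, chi-square = −(n − 1 − (2p + 5)/6) × ln|R| with p(p − 1)/2 degrees of freedom, where R is the item correlation matrix, n the number of respondents and p the number of items. The function names and the 3×3 matrix are hypothetical; the study's own correlation matrix is not available in the text, so the reported value of 5454.723 is not reproduced.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]        # work on a copy
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                    # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def bartlett_sphericity(corr, n_obs):
    """Bartlett's test statistic and degrees of freedom for a correlation matrix."""
    p = len(corr)
    chi_sq = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(corr))
    df = p * (p - 1) // 2
    return chi_sq, df

# Hypothetical correlation matrix for 3 items, with n = 410 as in the field sample.
corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
chi_sq, df = bartlett_sphericity(corr, n_obs=410)
print(round(chi_sq, 3), df)
```

With the 19 items entered into the factor analysis, the same formula would give 19 × 18 / 2 = 171 degrees of freedom; a p-value would then be read from the chi-square distribution, which this sketch leaves out.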

DISCUSSION

Internal consistency

Cronbach’s alpha coefficient values above 0.70 are considered acceptable; values above 0.80 are preferable.37 For cognitive tests such as intelligence tests, an acceptable value of 0.80 is appropriate, while a cut-off point of 0.70 is more suitable for ability tests.38 Against these reference standards, the questionnaire constructs were internally consistent: all constructs showed alpha>0.70 at both test and retest, and the perspective and expectation scales in particular demonstrated alpha>0.90. The items assessing the ATT subscale were also consistent, with alpha=0.939 and 0.945 at test and retest respectively. The SN and PBC subscales both showed alphas near 0.9 at test and retest. The alpha value increases as the inter-correlation between items and the number of items increase.39 A very high alpha value is, however, suggestive of lengthy scales and the possibility of parallel items. Although the questionnaire had been pilot tested and its length was reported as acceptable by respondents, it is suggested that some items be removed in future studies to shorten the questionnaire.
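As an illustration of the coefficient discussed above, the following is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the small data set are hypothetical and are not taken from the PVASQ data; the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents.

    rows: one list per respondent, each holding k item scores
    (hypothetical layout, chosen for this sketch).
    """
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 3 respondents, 2 Likert-type items.
example = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(example)
print(round(alpha, 3))
```

When every item ranks respondents identically (for example rows of `[1, 1, 1]`, `[2, 2, 2]`, `[3, 3, 3]`), the function returns 1.0, which is the situation the text describes as suggestive of parallel items.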

A common interpretation of alpha is that it measures “unidimensionality”, which means the scale measures one underlying factor or construct.32 Therefore, alpha is a measure of the strength of a factor.39 With the alpha values shown, the study demonstrates that the overall questionnaire is reliable and consistent over time and therefore is valid as well. Many studies have found evidence of good reliability when using the TPB to construct questionnaires in social and health related research. For example, Torres-Harding, Siers, and Olson40 reported a high alpha coefficient (alpha=0.93) for the entire 44-item Social Justice Scale, with alpha=0.89, alpha=0.85 and alpha=0.77 respectively for ATT, SN and PBC. In a sleep hygiene investigation among university students in Australia, the reported alpha values for ATT, SN, PBC and INT were 0.92, 0.87, 0.83 and 0.84.41

Reliability of test-retest: Intra-rater agreement

Researchers must first be able to differentiate the conceptual and practical applications of correlations and intra-rater agreement. We provide a rational argument to show that correlations do not and should not be considered a satisfactory metric for the purpose of establishing test-retest reliability. While many estimators of the measure of agreement between two dichotomous ratings of a person have been proposed, Blackman and Koval42 further explain that in the absence of a standard against which to assess the quality of measurements, researchers typically require that a measurement be performed by two raters (inter-rater reliability) or by the same rater (intra-rater reliability) at two points in time. The degree of agreement between these two ratings is then an indication of the quality of a single measurement. This implies test-retest reliability by means of stability across time.

The measure of agreement known as kappa is intended as a measure of association that adjusts for chance agreement.43 The assumptions of Cohen’s kappa coefficient of agreement are that the units are independent, the categories of the nominal scale are independent, mutually exclusive and exhaustive, and the judges (raters) operate independently. Kappa values range from -1 to +1: a negative value indicates poorer than chance agreement, a positive value indicates better than chance agreement, and a value of unity indicates perfect agreement.44 The following standards for strength of agreement for the kappa coefficient were proposed: 0 or lower=poor, 0.01-0.20=slight, 0.21-0.40=fair, 0.41-0.60=moderate, 0.61-0.80=substantial, and 0.81-1.00=almost perfect.45 Using the standards suggested by Portney and Watkins46, a corresponding set of thresholds was defined for the ICC: ICC<0.50 (low), ICC: 0.50 - 0.75 (moderate), ICC>0.75 (good). One concern with kappa is that it was designed for nominal random variables, whereas for ordinal data the seriousness of a disagreement depends on the difference between the ratings.47
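The chance correction described above can be sketched as follows, assuming two ratings per unit (for example, a test and a retest answer to the same yes/no item). Kappa is computed as (observed agreement − expected agreement) / (1 − expected agreement). The function names and the ratings are hypothetical; the labelling function simply encodes the strength-of-agreement bands quoted in the text.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of nominal ratings."""
    n = len(first)
    # Proportion of units on which the two ratings coincide.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[cat] * c2[cat] for cat in set(c1) | set(c2)) / n ** 2
    return (observed - expected) / (1 - expected)


def strength_of_agreement(kappa):
    """Label a kappa value using the bands quoted in the text."""
    if kappa <= 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical yes/no answers (1 = yes, 0 = no) at test and retest.
test_answers = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
retest_answers = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
kappa = cohen_kappa(test_answers, retest_answers)
print(round(kappa, 2))
```

In this hypothetical example the raw agreement is 80%, but half of that is expected by chance alone, so kappa is noticeably lower than the raw percentage.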

The weighted kappa coefficient is probably the most useful method for agreement for ordinal data, but several issues of concern arise from using this method in analysis.48 It was explained that the problem with the kappa statistic is that the kappa value depends on the prevalence in each category, which leads to difficulties in comparing the kappa values of different studies with different prevalence in the categories. Many factors can influence the magnitude of kappa, with the most common being the prevalence, bias and non-independence of ratings.49

To remedy this problem, Fleiss and Cohen44 suggested that the intraclass correlation coefficient (ICC) is the mathematical equivalent of the weighted kappa for ordinal data and pointed out that the ICC is the special case of weighted kappa when the categories are equally spaced points along one dimension. Other literature also supports the use of the ICC for ordinal data with equal distance between intervals.46 Other researchers have demonstrated that, to analyze the reliability of data obtained with an original continuous scale, methods such as the ICC, the standard error of measurement, or the bias and limits of agreement can be used.48,49 In some widely used numerical scales of psychopathology, the ICC has been shown to produce reliabilities between 0.70 and 0.90.50
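For readers who wish to reproduce an ICC [1,1] by hand, the following is a minimal sketch of the one-way random-effects, single-measure form, ICC = (MSB − MSW) / (MSB + (k − 1) × MSW), where MSB and MSW are the between-subject and within-subject mean squares from a one-way ANOVA. The function names and data are hypothetical; confidence intervals, which SPSS reports, are not computed here.

```python
def icc_1_1(scores):
    """One-way random-effects, single-measure ICC.

    scores: one row per respondent, one column per occasion
    (for example [test_score, retest_score]).
    """
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square (pooled over respondents).
    ms_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def icc_label(icc):
    """Label an ICC using the Portney and Watkins bands quoted in the text."""
    if icc < 0.50:
        return "low"
    if icc <= 0.75:
        return "moderate"
    return "good"


# Hypothetical scale scores for three respondents at test and retest.
identical = [[1, 1], [2, 2], [3, 3]]   # retest repeats the test exactly
shifted = [[1, 2], [2, 3], [3, 4]]     # retest is always one point higher
print(icc_1_1(identical), round(icc_1_1(shifted), 2))
```

The second data set shows that a constant shift between occasions lowers the ICC even though the ordering of respondents is unchanged.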

Jakobsson and Westergren48 pointed out that some researchers used correlation as a measure of agreement. Correlation, like the chi-square test, is a measure of association and does not satisfactorily measure agreement.51,52 Association can be defined as two variables that are not independent, while agreement is a special case of association where the data in the diagonal (perfect agreement) are of most interest. Therefore it should be noted that perfect association does not automatically mean perfect agreement because a perfect correlation (r=1.0) can be obtained even if the intercept is not zero and the slope is not 1.0. To illustrate this clearly, Jakobsson and Westergren showed an example when one of the observers constantly grades the scores a little higher than the other observers. This will give a high association but low agreement. Thus, correlation does not account for systematic biases. Furthermore, the correlation coefficient tends to be higher than the “true” reliability.53 Therefore in this study, kappa and ICC were utilized to establish test-retest reliability. The results indicated evidence for the repeatability for construct measurements between two time points. Therefore, it is concluded that the intra-rater reliability test was established.
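The example of an observer who consistently scores higher can be sketched numerically. The ratings below are hypothetical: the second rating is always two points above the first, so the Pearson correlation is perfect while no pair of ratings actually agrees.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical ratings: the second rating is systematically 2 points higher.
first_rating = [1, 2, 3, 4, 5]
second_rating = [score + 2 for score in first_rating]

r = pearson_r(first_rating, second_rating)
# Proportion of pairs with identical ratings (the diagonal of the table).
exact_agreement = sum(a == b for a, b in zip(first_rating, second_rating)) / len(first_rating)
# Average difference between ratings, i.e. the systematic bias.
mean_bias = sum(b - a for a, b in zip(first_rating, second_rating)) / len(first_rating)

print(r, exact_agreement, mean_bias)  # prints: 1.0 0.0 2.0
```

The slope here is 1.0 but the intercept is 2, which is exactly the case in which association is perfect and agreement is absent.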

Confirmatory Factor Analysis: Construct Validity

Guidelines regarding sample size for factor analysis vary dramatically from researcher to researcher, from a suggested ratio of 5 participants per measured variable with no fewer than 100 participants54 to a proposed ratio of 10 to 1.55,56 Comrey and Lee57 urge researchers to obtain samples of 500 or more whenever possible and offered a rough rating scale for sample size adequacy for factor analysis: 100=poor, 200=fair, 300=good, 500=very good, 1,000 or more=excellent. The SPSS software package uses Bartlett’s test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy to assess the adequacy of the correlation matrices for factor analysis.

For a large sample, Bartlett’s test statistic approximates a chi-square distribution and therefore serves as a baseline test for large samples, but it is less reliable in small samples. Very small significance values (below 0.05) indicate a high probability that there are significant relationships between the variables, while higher values (0.10 and above) indicate that the data are inappropriate for factor analysis. The KMO measure of sampling adequacy provides an index (between 0 and 1) of the proportion of variance among the variables that might be common variance. A value of KMO near 1.0 supports a factor analysis, while a value of less than 0.50 suggests that the data are not suitable for factor analysis. In other words, a value of KMO=0 indicates that the sum of partial correlations is large relative to the sum of correlations, indicating diffusion in the pattern of correlations and therefore suggesting that factor analysis is likely to be inappropriate.31 Similarly, a value close to 1.0 indicates that patterns of correlations are relatively compact and so factor analysis should yield distinct and reliable factors.31 There are several reference values for KMO. Kaiser58 proposed that good factor-analytic data should score at least in the 0.80s and that really excellent data score in the 0.90s. Others describe values between 0.50-0.70 as mediocre, 0.70-0.80 as good, 0.80-0.90 as great and values above 0.90 as superb.59 For a sample size of 410, a value of KMO=0.940 with Bartlett’s test of sphericity showing p<0.05 confirms the appropriateness of performing factor analysis on this dataset.
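Both statistics can be computed directly from a correlation matrix. The following is a minimal sketch using the textbook formulas: Bartlett's chi-square = −(n − 1 − (2p + 5)/6) × ln|R| with p(p − 1)/2 degrees of freedom, and KMO = Σr² / (Σr² + Σq²), where r are the off-diagonal correlations and q the corresponding partial correlations obtained from the inverse of R. The helper names and the 3×3 matrix are hypothetical, the p-value is not computed (the standard library has no chi-square distribution), and SPSS output may differ in rounding.

```python
import math


def invert_and_det(matrix):
    """Gauss-Jordan elimination; returns (inverse, determinant)."""
    n = len(matrix)
    # Augment the matrix with the identity on the right-hand side.
    a = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        pivot_value = a[col][col]
        a[col] = [v / pivot_value for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a], det


def bartlett_sphericity(corr, n_obs):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    p = len(corr)
    _, det = invert_and_det(corr)
    chi_square = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(det)
    return chi_square, p * (p - 1) // 2


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    p = len(corr)
    inverse, _ = invert_and_det(corr)
    sum_r2 = sum_q2 = 0.0
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inverse[i][j] / math.sqrt(inverse[i][i] * inverse[j][j])
                sum_r2 += corr[i][j] ** 2
                sum_q2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_q2)


# Hypothetical correlation matrix for three items, all correlated at 0.5.
corr = [[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]
chi_square, df = bartlett_sphericity(corr, n_obs=100)
print(round(chi_square, 2), df, round(kmo(corr), 3))
```

The chi-square value would then be compared against a chi-square distribution with `df` degrees of freedom to obtain the significance level discussed above.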

Recommendations on factor loading cut-offs vary and are largely silent on the appropriate minimum loading, leaving room for researchers to improvise.60 A loading of more than 0.30 is considered important,32 some researchers recommend interpreting only factor loadings with an absolute value greater than 0.40,61 and others consider a factor loading of 0.60 or more as “strong”.62 Some suggest that, in order to retain an item on a scale, the factor loading of the item should be higher than 0.30 with no higher loading on another factor,63 while others maintain the test-retest procedure but omit factor analysis.64

CFA was performed on 19 items and showed that almost all items loaded distinctly and significantly onto their respective four factors. These four factors were: Attitudes, Subjective Norms, Perceived Behavioral Control and Expectations. Items P1, P2, P3 and P7 loaded clearly onto the first factor (Attitudes). These items tested attitudes towards using pharmacy value-added services to collect monthly medicine and refer to the degree to which a respondent has a favorable or unfavorable evaluation of the behaviour. Of the 4 items in the Attitudes factor, P1, P2 and P3 had strong loadings above 0.60, while P7 had a slightly lower factor loading of 0.438, which was still sufficient for it to be retained in the Attitudes domain. One reason for this lower loading is that item P7 was a negatively phrased question. Some respondents might have overlooked the negative wording in item P7 while self-administering the questionnaire.

The second factor, Subjective Norms (SN), contained items P4, P5 and P6, which explore the patient’s perceived social pressure to use pharmacy value-added services. Factor loadings of these 3 items were moderate to strong (0.494-0.749). However, we found item P4 cross-loading onto the Attitudes factor because item P4 was sequenced immediately after item P3. The order effect of item P3 preceding P4 provided an attitudes context in respondents’ minds while they attempted item P4. It is postulated that assimilation effects occur when responses to two questions are consistent and closer together due to their placement in the questionnaire. For the TPB model questionnaire, it is recommended that items be mixed up throughout the document, with questions used to assess attitudes interspersed with questions measuring subjective norms and perceived behavioral control.28 For this reason, items were intermixed between constructs (factors). Conventionally, it was recommended that items of the same topic and relevance should be grouped together and unfolded in a logical order.

Subsequently, P8, P9, P10, P12 and P13 loaded strongly onto the third factor (Perceived Behavioral Control). Item P8 was initially constructed to be included in the ATT factor, but the analysis showed that it had loaded onto the PBC factor. P8 had the lowest extracted communality among all items, at 0.271. Since the cut-off value to retain an item is 0.30, P8 was excluded from the final analysis. Questions P11, P14 and P15 are dependent variables (Intention), which are not included in the factor analysis and hence are not shown. To illustrate the nature of the questions assessing the Intention construct, the three items pertaining to the dependent variable are shown below.
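The retention rule applied to P8 can be illustrated with a short sketch. It assumes an orthogonal rotation, in which case an item's extracted communality equals the sum of its squared loadings across the extracted factors. The loadings and item names below are hypothetical and are not the values in Table 5.

```python
def communalities(loadings):
    """Sum of squared loadings per item (assumes orthogonal factors)."""
    return {item: sum(value ** 2 for value in row)
            for item, row in loadings.items()}


def items_below_cutoff(loadings, cutoff=0.30):
    """Items whose communality falls below the retention cut-off."""
    return [item for item, h2 in communalities(loadings).items() if h2 < cutoff]


# Hypothetical loadings of two items on four factors.
example_loadings = {
    "item_a": [0.8, 0.1, 0.1, 0.1],   # communality 0.67, retained
    "item_b": [0.3, 0.4, 0.1, 0.1],   # communality 0.27, below the cut-off
}
print(items_below_cutoff(example_loadings))
```

An item such as the hypothetical `item_b`, which loads only weakly on every factor, shares little variance with the factor solution and is a candidate for removal.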

  • P11: I want to use pharmacy value-added services to collect my monthly medicine in the next 3 months.

  • P14: I am interested to use one of the pharmacy value-added services to collect my monthly medicine within the next 3 months.

  • P15: How far is your intention to use one of the pharmacy value-added services to collect your monthly medicine?

All items from E1 to E7 loaded strongly (>0.70) onto the fourth factor (Expectations). This is expected, as these seven questions were distinctively phrased with expectations wording and were placed last in the questionnaire. Fatigue effects in respondents might have encouraged more similar agreement, or less differentiated responses, among these items.

The validation process resulted in the four expected significant factors, with all items retained except item P8. The item rotation showed that the use of a tested theoretical model to predict intention to use pharmacy value-added services by the Malaysian public was successful. Maintaining relevant questions and removing ambiguous data from the final analysis is suggested, to focus the study on a more reliable analysis with stable results for future use of this instrument. Multiple regression with bootstrapping is also suggested to further predict patients’ intention to adopt pharmacy value-added services. The details of composite scoring and bootstrapping are beyond the scope of this paper and are left to future research.

Limitations

Randomization was not possible because no central electronic database storing information on all patients who visit the facilities exists. The instrument can be further tested using a more diverse ethnic mix, in both urban and rural settings, as well as in other states in Malaysia. However, this study is probably the first to establish reliability for the Malay version of PVASQ in this country. It contributes the first reliable and valid tool for measuring the intention to use pharmacy value-added services among Malaysian patients.

CONCLUSIONS

The results show that the PVASQ (Malay version) has good reliability and validity for assessing pharmacy value-added services adoption in Malaysia. The findings of this study therefore support the use of PVASQ in social pharmacy typically in Malaysian public health systems.

ACKNOWLEDGMENTS

The authors would like to thank the Director of Health Malaysia for permission to publish this paper. The authors would also like to express gratitude to all participants for their contribution and to all pharmacists in public pharmacy departments in Seremban state for their assistance. The authors would like to thank Mr Mohamad Adam Bujang from National Clinical Research Centre, Ministry of Health Malaysia, Malaysia for his useful advice.

References

Azmi M, Akmal SA, Chua G, eds. A national survey on the use of medicines (NSUM) by Malaysian consumers. Selangor: Quality Use of Medicines, Pharmaceutical Services Division, Ministry of Health Malaysia; 2013.

Ministry of Health Malaysia. Annual Report Ministry of Health Malaysia. Malaysia: Ministry of Health Malaysia; 2011. Available at: http://www.moh.gov.my/images/gallery/publications/md/ar/2011_en.pdf (accessed July 7, 2015).

Lin YF, Lin YM, Sheng LH, Chien HY, Chang TJ, Zheng CM, Lu HP. First drive-through pharmacy services in Taiwan. J Chin Med Assoc. 2013;76(1):37-41. doi: 10.1016/j.jcma.2012.10.001

McMillan SS, Sav A, Kelly F, King MA, Whitty JA, Wheeler AJ. Is the pharmacy profession innovative enough?: meeting the needs of Australian residents with chronic conditions and their carers using the nominal group technique. BMC Health Serv Res. 2014;14:476. doi: 10.1186/1472-6963-14-476

Whitty JA, Kendall E, Sav A, Kelly F, McMillan SS, King MA, Wheeler AJ. Preferences for the delivery of community pharmacy services to help manage chronic conditions. Res Social Adm Pharm. 2015;11(2):197-215. doi: 10.1016/j.sapharm.2014.06.007

Valluri S, Seoane-Vazquez E, Rodriguez-Monguio R, Szeinbach SL. Drug utilization and cost in a Medicaid population: A simulation study of community vs. mail order pharmacy. BMC Health Serv Res. 2007;7:122.

Kirking DM, Ascione FJ, Richards JW. Choices in prescription-drug benefit programs: mail versus community pharmacy services. Milbank Q. 1990;68(1):29-51.

Johnsrud M, Lawson KA, Shepherd MD. Comparison of mail-order with community pharmacy in plan sponsor cost and member cost in two large pharmacy benefit plans. J Manag Care Pharm. 2007;13(2):122-134.

Carroll NV, Brusilovsky I, York B, Oscar R. Comparison of costs of community and mail service pharmacy. J Am Pharm Assoc (2003). 2005;45(3):336-343.

Zhang L, Zakharyan A, Stockl KM, Harada AS, Curtis BS, Solow BK. Mail-order pharmacy use and medication adherence among Medicare Part D beneficiaries with diabetes. J Med Econ. 2011;14(5):562-567. doi: 10.3111/13696998.2011.598200

Schmittdiel JA, Karter AJ, Dyer W, Parker M, Uratsu C, Chan J, Duru OK. The comparative effectiveness of mail order pharmacy use vs local pharmacy use on LDL-C control in new statin users. J Gen Intern Med. 2011;26(12):1396-1402. doi: 10.1007/s11606-011-1805-7

Duru OK, Schmittdiel JA, Dyer WT, Parker MM, Uratsu CS, Chan J, Karter AJ. Mail-order pharmacy use and adherence to diabetes-related medications. Am J Manag Care. 2010;16(1):33-40.

Devine S, Vlahiotis A, Sundar H. A comparison of diabetes medication adherence and healthcare costs in patients using mail order pharmacy and retail pharmacy. J Med Econ. 2010;13(2):203-211. doi: 10.3111/13696991003741801

Motheral BR, Heinle SM. Predictors of satisfaction of health plan members with prescription drug benefits. Am J Health Syst Pharm. 2004;61(10):1007-1014.

Johnson JA, Coons SJ, Hays RD, Sabers D, Jones P, Langley PC. A comparison of satisfaction with mail versus traditional pharmacy services. J Manag Care Pharm. 1997;3(3):327-337.

Ajzen I, Fishbein M. Understanding Attitudes and Predicting Social Behaviour. Englewood Cliffs, NJ: Prentice-Hall; 1980.

Fishbein M, Ajzen I. Belief, Attitude, Intention, and Behaviour: An Introduction to Theory and Research. MA: Addison-Wesley; 1975.

Ajzen I. The Theory of Planned Behavior. Organiz Behav Human Decis Proc. 1991;50:179-211.

Ajzen I. From Intentions to Actions: A Theory of Planned Behaviour. In: Kuhl J, Beckmann J, eds. Action-control: From cognition to behavior. Heidelberg: Springer; 1985.

Ajzen I. The theory of planned behaviour: Reactions and reflections. Psychol Health. 2011;26(9):1113-1127. doi: 10.1080/08870446.2011.613995

McEachan RRC, Conner M, Taylor NJ, Jane LR. Prospective prediction of health-related behaviours with the Theory of Planned Behaviour: a meta-analysis. Health Psychol Rev. Sep 2011;5(2):97-144. doi: 10.1080/17437199.2010.521684

Sentosa I, Nik Mat NK. Examining a theory of planned behaviour (TPB) and technology acceptance model (TAM) in internet purchasing using structural equation modeling. J Arts Sci Commerce. 2012;3(2):62-77.

Carmack CC, Lewis-Moss RK. Examining the theory of planned behavior applied to condom use: the effect-indicator vs. causal-indicator models. J Prim Prev. 2009;30(6):659-676. doi: 10.1007/s10935-009-0199-3

Courneya KS, Plotnikoff RC, Hotz SB, Birkett NJ. Social support and the theory of planned behaviour in the exercise domain. Am J Health Behav. 2000;24(4):300-308. doi: 10.5993/AJHB.24.4.6

Freeney Y, O'Connell M. The predictors of the intention to leave school early among a representative sample of Irish second-level students. Br Educ Res J. 2012;38(4):557-574. doi: 10.1080/01411926.2011.563838

Yoon C. Theory of planned behavior and ethics theory in digital piracy: an integrated model. J Business Ethics. 2011;100(3):405-417.

Montano D, Kasprzyk D. Theory of Reasoned Action, Theory of Planned Behavior, and The Integrated Behavioral Model. In: Glanz K, Rimer BK, K V, eds. Health Behavior And Health Education. 4th ed. San Francisco: Jossey-Bass; 2008.

Francis JJ, Eccles MP, Johnston M, Walker A, Grimshaw J, Foy R, Kaner EFS, Smith L, Bonetti D. Constructing questionnaires based on the Theory of Planned Behaviour: A manual for health services researchers. United Kingdom 2004. ISBN: 0-9540161-5-7

Sekaran U, Bougie R. Research methods for business: a skill-building approach. 6th ed: West Sussex: Wiley; 2013. ISBN: 978-1-119-94225-2.

Nichols DP. Choosing an intraclass correlation coefficient. SPSS Keywords. 1998.

Field A. Discovering Statistics Using SPSS. 2nd (and sex, drugs and rock 'n' roll) ed. London: Sage; 2005.

Statistics Solutions. Confirmatory Factor Analysis. 2013. Available at: http://www.statisticssolutions.com/academicsolutions/resources/directory-of-statistical-analyses/confirmatory-factor-analysis/ (accessed March 9, 2015).

Kim JO, Mueller CW. Factor analysis: statistical methods and practical issues. Newbury Park: Sage Publications; 1978.

Cronbach LJ. Coefficient alpha and the internal structure of tests. Psychometrika. 1951;16:297-334.

Brown JD. Likert items and scales of measurement? SHIKEN: JALT Testing Evaluation SIG Newsletter. March 2011;1:10-14.

Carifio J, Perla RJ. Ten common misunderstandings, misconceptions, persistent myths and urban legends about Likert scales and Likert response formats and their antidotes. J Soc Sci. 2007;3(3):106-116.

Pallant J. SPSS Survival Manual: A step by step guide to data analysis using SPSS. 4th ed. Australia: Allen Unwin; 2011.

Kline P. The handbook of psychological testing. 2nd ed. London: Routledge; 1999.

Cortina JM. What is coefficient alpha? An examination of theory and applications. J Appl Psychol. 1993;78:98-104.

Torres-Harding SR, Siers B, Olson BD. Development and psychometric evaluation of the Social Justice Scale (SJS). Am J Comm Psychol. 2012;50:77-88.

Kor K, Mullan BA. Sleep hygiene behaviours: an application of the theory of planned behaviour and the investigation of perceived autonomy support, past behaviour and response inhibition. Psychol Health. 2011;26(9):1208-1224. doi: 10.1080/08870446.2010.551210

Blackman NJM, Koval JJ. Estimating rater agreement in 2x2 tables: Correction for chance and intraclass correlation. Appl Psychol Meas. 1993;17(3):211-223.

Cohen J. A coefficient of agreement for nominal scales. Educ Psychol Meas. 1960;20:37-46.

Fleiss JL, Cohen J. The equivalence of weighted kappa and the intraclass correlation coefficient as measures of reliability. Educ Psychol Meas. 1973;33:613-619.

Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.

Portney LG, Watkins MP. Foundations of clinical research: applications to practice. 2nd ed. Upper Saddle River, NJ: Prentice Hall Health; 2000.

Wilcox R. Modern statistics for the social behavioural sciences: a practical introduction. Los Angeles: Taylor Francis Group; 2012.

Jakobsson U, Westergren A. Statistical methods for assessing agreement for ordinal data. Scand J Caring Sci. 2005;19(4):427-431.

Sim J, Wright CC. The Kappa statistic in reliability studies: use, interpretation, and sample size requirements. Physical Therapy. 2005;85:257-268.

Spitzer RL, Cohen J, Fleiss JL, Endicott J. Quantification of agreement in psychiatric diagnosis. Arch Gen Psychiatry. 1967;17(1):83-87.

Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307-310.

Altman DG. Practical statistics for medical research. London: Chapman Hall; 1991.

Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. 2nd ed: Oxford University Press, Oxford; 1995.

Gorsuch RL. Factor analysis. 2nd ed. Hillsdale, NJ: Erlbaum; 1983.

Nunnally JC. Psychometric theory. 2nd ed. New York: McGraw-Hill; 1978.

Everitt BS. Multivariate analysis: the need for data, and other problems. Br J Psychiatry. 1975;126:237-240.

Comrey AL, Lee HB. A first course in factor analysis. Hillsdale, NJ: Erlbaum; 1992.

Kaiser HF. A second-generation little Jiffy. Psychometrika. 1970;35:401-415.

Hutcheson G, Sofroniou N. The multivariate social scientist. London: Sage; 1999.

Norris M, Lecavalier L. Evaluating the use of exploratory factor analysis in development disability psychological research. J Autism Dev Disord. 2010;40(1):8-20. doi: 10.1007/s10803-009-0816-2

Stevens JP. Applied multivariate statistics for the social sciences. 2nd ed. Hillsdale, NJ: Erlbaum; 1992.

Marsh HW, Hau KT. Confirmatory factor analysis: strategies for small sample size. Statistical strategies for small sample research. Thousand Oaks, CA: Sage Publications; 1999.

Fabrigar LR, Wegener DT, MacCallum RC, Strahan EJ. Evaluating the use of exploratory factor analysis in psychological research. Psychol Methods. 1999;4(3):272-299.

Duarte-Silva D, Figueiras A, Herdeiro MT, Teixeira Rodrigues A, Silva Branco F, Polonia J, Figueiredo IV. PERSYVE - Design and validation of a questionnaire about adverse effects of antihypertensive drugs. Pharm Pract (Granada). 2014;12(2):396.

Notes

CONFLICT OF INTEREST

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article. The usual disclaimer applies.

Funding: The study is a part of the corresponding author’s Master thesis in Social and Administrative Pharmacy, Universiti Sains Malaysia. The study scholarship is funded by Ministry of Health Malaysia.

Which measurement would a researcher use to test for reliability when the data are in dichotomous format?

The KR-20 (Kuder-Richardson Formula 20) coefficient is used to estimate the homogeneity (internal consistency) of instruments whose items are scored in a dichotomous ("yes/no") format.
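As a rough illustration of the KR-20 coefficient, the sketch below applies the standard formula KR-20 = k/(k-1) × (1 − Σ p·q / variance of total scores), where p is the proportion answering "yes" to an item and q = 1 − p. The function name and the answers are hypothetical, and the population variance is used for the totals so that it matches the p·q item variances.

```python
from statistics import pvariance


def kr20(rows):
    """KR-20 for dichotomous items.

    rows: one list per respondent, each holding k answers coded
    1 (yes) or 0 (no); this coding is an assumption of the sketch.
    """
    n = len(rows)
    k = len(rows[0])
    # Proportion of "yes" answers for each item.
    p = [sum(row[j] for row in rows) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # Population variance of the respondents' total scores.
    total_variance = pvariance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum_pq / total_variance)


# Hypothetical answers: 4 respondents, 3 yes/no items.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(answers))  # prints: 0.75
```

KR-20 is the special case of Cronbach's alpha for items with only two response options, so it is interpreted against the same reference values.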

What is the method to test the reliability of a measurement?

These four methods are the most common ways of measuring reliability for any empirical method or metric:

  • Inter-rater reliability

  • Test-retest reliability

  • Parallel forms reliability

  • Internal consistency reliability

How do you determine reliability of data in research?

Reliability can be estimated by comparing different versions of the same measurement. Validity is harder to assess, but it can be estimated by comparing the results to other relevant data or theory. Methods of estimating reliability and validity are usually split up into different types.

What are the 3 ways of measuring reliability?

Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).