Jump to content


Photo

logistic regression modeling


  • Please log in to reply
24 replies to this topic

#21 Arindam Pal1412845459

Arindam Pal1412845459

    Member

  • Members
  • PipPip
  • 12 posts
  • LocationIndia

Posted 15 September 2017 - 03:51 PM

When I copied in your model code I noticed immediately a basic syntax issue; in that you need to close brackets; if you look at the general tab you would have seen the errors noted in red too.

I haven't looked much furhter into you code but I did notice that there is some issue with the code/data still as you get this error in Warnings and errors when I try to execute.

Observable 'PID_score' category probabilities not descending

As I have to go into a meeting for the rest of the day I attach my project(v6.4) so far, you may need to confirm the mappings etc. and of course some initial estimates too.
Simon

PS Note this also being dealt with as Certara Support Case #00163890

Oh yes! I missed the bracket while quoting the model text here in the forum, when I looked back I saw the 2nd bracket was there in NLME.

So, bracket could not be an issue. 

I tried with initial estimates of coefficients, with larger the probability, lower the coefficients but still it didn't work.

Can "Observable 'PID_score' category probabilities not descending"  could be a problem?

But how? because with disease progression there's neither absolute pain reduction nor absolute worsening with time.

 

Regards

Arindam



#22 Simon Davis

Simon Davis

    Advanced Member

  • Administrators
  • 1,318 posts

Posted 15 September 2017 - 08:02 PM

I'm posting here on behalf of Serge;

 

This required a lot of changes but I made it.  The project is attached.

 

Many things to learn from it

 

1: To avoid that the cumulative probabilities will be < than the marginal(p54 must be >p5 for example), you need to parametrize differently as I did (see the code)

2: It is not a must but I used a trick to avoid that the optimizer would randomly get the cumulative P54 for example to be less than P5 (see the code below)

e1=(e1<(-10)?(-10):e1)

e2=(e2<(-10+0.000001)?(  -10+0.000001  ):e2)    

e3=(e3<(-10+0.000002)?(  -10+0.000002  ):e3)

e4=(e4<(-10+0.000003)?(  -10+0.000003  ):e4)    

 

If you use reasonable initial estimates this code is not necessary

 

Now you can see that CE go up to 140. Therefore you need to put for the product th6*Ce a value that is not too big (usually 0.1 to 1 is good). Therefore I put th6=0.001 to have the maximum product of 0.001 *140~0.2

Same trick for th5

 

I then completed the code for multi statement and LL statement and tried 2 completely different initial estimates. It converged immediately and no issue.

 

Note also I changed the scores that were for -1 to 3 to 0 to 4. This is needed when you use the multi statement (not the LL statement).

 

 

Hope it helps.

Best Regards

Serge

Attached Files



#23 Arindam Pal1412845459

Arindam Pal1412845459

    Member

  • Members
  • PipPip
  • 12 posts
  • LocationIndia

Posted 18 September 2017 - 05:46 AM

I'm posting here on behalf of Serge;

 

This required a lot of changes but I made it.  The project is attached.

 

Many things to learn from it

 

1: To avoid that the cumulative probabilities will be < than the marginal(p54 must be >p5 for example), you need to parametrize differently as I did (see the code)

2: It is not a must but I used a trick to avoid that the optimizer would randomly get the cumulative P54 for example to be less than P5 (see the code below)

e1=(e1<(-10)?(-10):e1)

e2=(e2<(-10+0.000001)?(  -10+0.000001  ):e2)    

e3=(e3<(-10+0.000002)?(  -10+0.000002  ):e3)

e4=(e4<(-10+0.000003)?(  -10+0.000003  ):e4)    

 

If you use reasonable initial estimates this code is not necessary

 

Now you can see that CE go up to 140. Therefore you need to put for the product th6*Ce a value that is not too big (usually 0.1 to 1 is good). Therefore I put th6=0.001 to have the maximum product of 0.001 *140~0.2

Same trick for th5

 

I then completed the code for multi statement and LL statement and tried 2 completely different initial estimates. It converged immediately and no issue.

 

Note also I changed the scores that were for -1 to 3 to 0 to 4. This is needed when you use the multi statement (not the LL statement).

 

 

Hope it helps.

Best Regards

Serge

Thank you Simon and Serge for your insightful guidance.

 

I realized that the code should have cumulative probabilities, the probabilities of transition from one score to the other over time.

Also, the role of optimizer is important to converge the model.

 

Parameterizing coefficients is important and I lack knowledge in that area;  are there any literature to follow to have some basic idea?

I know all MLL estimations are computer-intensive and can be done through MsExcel too but definitely a literature/slidedeck/lecture would have been helpful. 

 

Kind regards

Arindam



#24 Arindam Pal1412845459

Arindam Pal1412845459

    Member

  • Members
  • PipPip
  • 12 posts
  • LocationIndia

Posted 24 October 2017 - 08:28 AM

I was thinking as how it is different than hidden markov model (HMM), as both has probability of phase transitions accounted.

The data I used has pain relief scores rising from the baseline with max.score of 4 (no pain) and then transitions to 3 or 2 (mild to moderate pain) and even 1 (severe pain).

Any way the group would like to add to the knowledge of modelling this data in the form of HMM?

 

Thanks & best regards,

 

Arindam



#25 Arindam Pal1412845459

Arindam Pal1412845459

    Member

  • Members
  • PipPip
  • 12 posts
  • LocationIndia

Posted 21 November 2017 - 12:38 PM

Dear All,

 

I wanted to simulate PR (pain relief) scores of a different formulation (with different “Ce” values coming from sigmoidal Emax model) from the logit-modeled data we earlier used.

 

The problem is:

I cannot internally validate the same formulation (with same “Ce” values) with which the model was built. A change in PR score over time for this formulation is not reflecting on simulation.

While simulating I found that only one of the PR scores is found to be dominating throughout the overall timeframe of 24 hours.

That means the model is misspecified or there’s a problem in the code, which I tried to change with initial estimates and difference in coding but could not breakthrough.

Tried accommodating the changing pattern of categorical scores but with no luck, maybe the method is not correct.

 

Then, I looked into the earlier data and found the PID scores’ frequency in terms of score category is: 4<3<0<2<1 (More 1’s than 2’s than 0’s and so on…).

The PR scores’ frequency in terms of score category is: 3<4<2<0<1.

 

I assume this difference in PR scores has a bearing on initial estimates, on thetas. Tried changing the initial estimates but of no use.

 

I doubt the transition of probabilities from low to high scores and vice versa over time is not coded properly.

 

I have attached the project with the same coding used for PID (pain intensity difference) though (not reflecting the exercises I did) with the new PR score data.

 

Can any one kindly guide?

 

Kind regards,

Arindam 

Attached Files






2 user(s) are reading this topic

0 members, 2 guests, 0 anonymous users