Intel Speedstep under DOS

Intel Speedstep under DOS

Zoltán Bacskó

Hi !

I experimented with Intel’s Speedstep under DOS yesterday.

I must say AMD’s Cool & Quiet solution is more DOS friendly :)

Reasons:

1. Voltage settings: the VID values are processor-specific and not public. Intel’s manual says:

“The 16-bit encoding that defines valid operating points is model-specific. Applications and performance tools are not expected to use either IA32_PERF_CTL or IA32_PERF_STATUS and should treat both as reserved.”

In contrast, AMD’s VID values are part of the specification (a CPU-model-independent coding scheme) and are public.

2. But the main problem is the multicore logic of Speedstep. A transition to a lower performance state is only possible if all cores get the low p-state values. If only one core gets the lower p-state values in its MSR, then that core registers the request for the targeted operating point, but the transition won’t occur. Since under DOS only the bootstrap processor is available, you can only set higher p-states than the values configured in the BIOS. (The BIOS always sets all cores to the same values.)

In contrast, on AMD K8 it’s enough to set the MSR on one core and the other core automatically follows the new settings. On the AMD K10 series the cores can work independently, so the other cores stay in the BIOS-defined state, but the bootstrap core can be set freely.
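
The three behaviours described in point 2 can be pictured with a toy model. This is only a sketch of what the post reports, not of documented hardware logic. The function names and the FID values are made up for illustration, and only the FID (multiplier) is modelled.

```python
# Toy model of per-core p-state requests, as described in the post.
# Index 0 is the bootstrap core, the only one reachable from DOS.
# All names and FID values here are illustrative assumptions.

def write_boot_core(requested_fids, new_fid):
    """DOS can only write the MSR of the bootstrap core (index 0)."""
    updated = list(requested_fids)
    updated[0] = new_fid
    return updated

def speedstep_effective_fid(requested_fids):
    """Speedstep as reported: the package only drops to a lower state
    when every core asks for it, so the highest request wins."""
    return max(requested_fids)

def k8_after_write(requested_fids, new_fid):
    """AMD K8 as reported: the other core follows the written core."""
    return [new_fid for _ in requested_fids]

def k10_after_write(requested_fids, new_fid):
    """AMD K10 as reported: only the written core changes state."""
    return write_boot_core(requested_fids, new_fid)

bios_at_max = [11, 11]  # BIOS left both cores at the highest multiplier
bios_at_min = [6, 6]    # BIOS left both cores at the lowest multiplier

print(speedstep_effective_fid(write_boot_core(bios_at_max, 6)))   # 11
print(speedstep_effective_fid(write_boot_core(bios_at_min, 11)))  # 11
print(speedstep_effective_fid(write_boot_core(bios_at_min, 8)))   # 8
print(k8_after_write(bios_at_max, 6))                             # [6, 6]
print(k10_after_write(bios_at_max, 6))                            # [6, 11]
```

The first line of output shows why lowering the p-state from DOS has no effect when the BIOS starts at maximum, and the next two show why starting from the BIOS minimum leaves the whole range reachable.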

 

To get to the point: I have written a small utility that can adjust Speedstep p-states under DOS (with the above-mentioned multicore restrictions). So if you would like to use it with a multicore, Speedstep-capable Intel CPU, you should set the multiplier and voltage to minimum in the BIOS; this way you can set all values freely in DOS. I suppose this is not required with a single-core CPU (Pentium M, Core Solo).

 

Any feedback is welcome.

 

The software can be downloaded from here (source is included):

http://falcosoft.hu/sstep.zip

 

 

Regards:

Falco

 

PS: JEMM386 and UMBPCI are OK, but you cannot use this program if EMM386 is loaded.


_______________________________________________
Freedos-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/freedos-devel

Re: Intel Speedstep under DOS

Zoltán Bacskó
Hi !

First and foremost thanks for the feedback.
I’m sorry if it was not clear. This is not a release, just an experiment. Its draft status can only change if I get some feedback about its usability and ‘universality’ among Speedstep-enabled processors. I could only test it with exactly 2 processors:

1. Intel Atom N455, 1.66 GHz, 1 core / 2 threads. FreeDOS kernel 2041.
2. Intel Core 2 Duo E7500, 2.93 GHz, 2 cores. MS-DOS 7.1 (Win98 SE).

It worked in both cases, but only the single-core Atom was freely adjustable (I mentioned the multicore problem previously). I don’t think it is kernel dependent.

I used this Intel document:
http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.html

The relevant part is Chapter 14, Power and Thermal Management.
As I said before, it gives no exact description of the required encoding of the MSRs.

My findings:
1. You must enable Speedstep by setting bit 16 of the IA32_MISC_ENABLE MSR (0x1A0).
2. The IA32_PERF_STATUS MSR (0x198) holds the current, minimum and maximum values (after RDMSR, EAX holds the low 32 bits and EDX the high 32 bits of the MSR).

Register EAX bits 0..7: the current VID. Processor-specific and not documented. Possible values are in the 0 – 0x3F range. E.g. on the Core 2 Duo E7500 the range is 0x22 – 0x38, that is 1.1 V – 1.3 V.
Register EAX bits 8..15: the current FID. Apparently this value (contrary to the official Intel documentation) is not processor-specific and is equivalent to the current multiplier.
Register EDX bits 0..7: The maximum VID value.
Register EDX bits 8..15: The maximum FID value (multiplier).
Register EDX bits 16..23: The minimum VID value.
Register EDX bits 24..31: The minimum FID value (multiplier).

3. IA32_PERF_CTL MSR (0x199): you can write this MSR to initiate p-state transitions.
Register EAX bits 0..7: the desired VID value.
Register EAX bits 8..15: the desired FID value.
When you read this MSR, you get the values of the last targeted operating point.
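
The bit layout in the findings above can be written down as plain bit arithmetic. This is a minimal sketch based only on the layout reported in this post. The function names are made up, the register contents are invented example values, and the actual RDMSR/WRMSR access (which needs ring 0) is not shown.

```python
# Bit arithmetic for the MSR layout reported in the findings above.
# Function names and the sample register values are illustrative assumptions;
# reading or writing the MSRs themselves is not done here.

IA32_PERF_STATUS = 0x198
IA32_PERF_CTL = 0x199
IA32_MISC_ENABLE = 0x1A0
SPEEDSTEP_ENABLE_BIT = 1 << 16  # bit 16 of IA32_MISC_ENABLE

def enable_speedstep(misc_enable_low):
    """Return the low dword of IA32_MISC_ENABLE with bit 16 set."""
    return misc_enable_low | SPEEDSTEP_ENABLE_BIT

def decode_perf_status(eax, edx):
    """Split IA32_PERF_STATUS (EAX = low dword, EDX = high dword) into fields."""
    return {
        "CurrentVid": eax & 0xFF,
        "CurrentFid": (eax >> 8) & 0xFF,
        "MaxVid": edx & 0xFF,
        "MaxFid": (edx >> 8) & 0xFF,
        "MinVid": (edx >> 16) & 0xFF,
        "MinFid": (edx >> 24) & 0xFF,
    }

def encode_perf_ctl(fid, vid):
    """Build bits 0..15 of EAX for a write to IA32_PERF_CTL."""
    return ((fid & 0xFF) << 8) | (vid & 0xFF)

# Invented register contents, for illustration only.
sample_eax = (11 << 8) | 35
sample_edx = (6 << 24) | (31 << 16) | (11 << 8) | 35

print(decode_perf_status(sample_eax, sample_edx))
print(hex(encode_perf_ctl(6, 31)))    # 0x61f
print(hex(enable_speedstep(0x0)))     # 0x10000
```

Reading IA32_PERF_CTL back and splitting its low word the same way gives the last targeted FID/VID pair.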

So, here is how to use the (draft status) program:

1. First run the program without any argument.
2. Look at the output. The program checks for Speedstep support by checking CPUID ECX feature bit 7. If this bit is not set, you get ‘CPU not supported’ and there is nothing more to do :) (You asked for the exact CPU models/steppings that are supported. The short answer is that I don’t know, and I need your help to determine this.)
3. If Speedstep is supported, you get the current values and the possible minimum and maximum values. You can then run the program with the 2 required arguments: FID (multiplier) and VID (processor-specific voltage ID). You should try values within the minimum/maximum range. If the p-state transition succeeded, you should see that the LastTriedFid/Vid and CurrentFid/Vid values are equal. If not, then most likely the LastTriedFid/Vid values are the ones you specified, but CurrentFid/Vid are unchanged. This is the situation I described before as the flawed multicore logic of Speedstep. In this case you should lower your initial p-state in the BIOS.
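
The checks in steps 2 and 3 can be sketched as follows. This is not the program’s own code, just an illustration of the logic described above. The function names and result strings are made up, and the (FID, VID) pairs are example values.

```python
# Sketch of the checks described in steps 2 and 3.
# Names, messages and sample values are illustrative assumptions.

SPEEDSTEP_FEATURE_BIT = 1 << 7  # CPUID ECX feature bit 7, as in step 2

def speedstep_supported(cpuid_ecx):
    """True if the Speedstep feature bit is set in the given ECX value."""
    return bool(cpuid_ecx & SPEEDSTEP_FEATURE_BIT)

def classify_transition(requested, last_tried, current):
    """Compare (FID, VID) pairs after running the tool with two arguments."""
    if last_tried == requested and current == requested:
        return "transition succeeded"
    if last_tried == requested:
        return "request registered, but no transition"
    return "request was not registered"

print(speedstep_supported(0x80))                           # True
print(speedstep_supported(0x00))                           # False
print(classify_transition((6, 31), (6, 31), (6, 31)))      # transition succeeded
print(classify_transition((6, 31), (6, 31), (11, 35)))     # request registered, but no transition
```

The second result of `classify_transition` is the multicore case: the request is stored, but the current values stay where the BIOS left them.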

Yes, the source is currently TP6/7 compatible, but you can use the freeware tool TPC16 to compile it.
http://turbo51.com/compiler-design/tpc16-turbo-pascal-compiler-written-in-turbo-pascal

Any feedback is still welcome, especially from single-core CPU users (Pentium M, Core Solo, Atom), to verify my theory that the multicore problem really affects only multi-core processors.

Regards:
falco




On Mon, Jan 27, 2014 at 11:02 PM, Rugxulo <[hidden email]> wrote:
Hi, (off-list, though feel free to forward back if you really care)

On 1/24/14, Zoltán Bacskó <[hidden email]> wrote:
>
> I experimented with Intel’s Speedstep under DOS yesterday.
>
> 2. But the main problem is the multicore logic of Speedstep. Transitions to
> lower performance states are only possible if All cores gets the low
> p-state values. If only one core gets the lower pstate values in its MSR
> then the CPU core registers the request to the targeted operating point but
> the transition won’t occur. Since under DOS only the bootstrap processor is
> available, you can only set higher p-states than the values configured in
> the BIOS. (BIOS always sets All cores to the same values).

I'd have to double-check, but I'm not sure my BIOS is that
configurable. Honestly, it's all very confusing to me. This particular
cpu (Nehalem Westmere) is supposed to support Turbo Boost, but I'm not
sure the Lenovo motherboard supports it! At least, I have no idea how
to enable it. Not that I really care, just saying, it's complicated!

> To get to the point I have written a small utility that can adjust
> Speedstep p-states under DOS (with the above mentioned multicore
> restrictions). So if you would like to use it with a multicore speedstep
> capable Intel CPU you should set the multiplier and voltage to minimum in
> BIOS, in this way you can set all values freely in DOS. I suppose with a
> single core CPU (Pentium M, Core Solo) this is not required.
>
> Any feedback is welcome.

Well, I don't have a lot of machines to test. Presumably this kind of
thing is more useful on laptops (slow the processor down to save
battery) than desktops (always plugged in). Yet ironically my Dell
(Intel) laptop has no APM (DOS-friendly, obsolete, everybody prefers
ACPI) support but my Lenovo (Intel) desktop does. (Both are roughly
from 2009 and 2010.)

So I can't use FDAPM very effectively on my laptop, thus any DOS use
is pretty much guaranteed to be short-lived since I can't even tell
how much battery I have left (until the white power light on the front
turns orange). For that reason (and others), I don't use DOS on that
laptop very much (and thus don't even have it natively installed, only
very very rarely use RUFUS via USB).

> The software can be downloaded from here (source is included):
>
> http://falcosoft.hu/sstep.zip

I did download this for quick inspection. I'm no engineer, so I'm
fairly useless technically. So most of this is just comments as an end
user:

1). What license? I see no text for it. Maybe you think it's too
trivial or too alpha / beta to worry right now. Maybe you wanted more
feedback first (and I've not tested it).

2). How to test it? First of all, what cpus are supported (family /
model / stepping)? Where are the docs? There's no readme.txt, not even
a pointer to an Intel .pdf URL nor wiki page. A few examples of use
(with mention of your specific processor) would be nice.

http://en.wikipedia.org/wiki/Intel_speed_step

3). You didn't mention which DOS (or kernel) you tested, nor cpu, nor
even which compiler was used to build this. Yes, I remember you don't
use FPC because you had trouble with it. So I'm blindly guessing
you're using TP7. Which is fine, I suppose (although I don't have it),
but just to be clear it might be nice to document all of that
explicitly. Honestly, it seems (almost) unnecessary to use Pascal +
inline asm at all here, just only use assembly (or WatcomC inline asm
or pragma aux, something with direct 386 support).

> Ps: JEMM386 and UMBPCI is OK, but you cannot use this program if EMM386 is
> loaded.

Presumably because of ring 0 needed, which JEMM works around / emulates for us.

BTW, I like Pascal (although I'm no expert or anything), but I can't
imagine FPC wouldn't work here too (in lieu of TP7). It supports 386
natively, so it doesn't need dumb opcode overrides. I can't imagine it
wouldn't be possible to get working since CWSDPMI has a ring 0 version
(CWSDPR0). Actually, even beyond that, there is (barely) an 8086 real
mode target for FPC nowadays (in unofficial 2.7.x, which will become
2.8.0 eventually), but it's very rough around the edges. See BTTR's
Forum for details about an unofficial snapshot (but I think we're more
likely to get 2.6.4 than 2.8.0 any time soon):

http://www.bttr-software.de/forum/board_entry.php?id=12985#p13014

I know that's not very important, I'm not asking you to waste your
time, just telling you for completeness.


_______________________________________________

Re: Intel Speedstep under DOS

Zoltán Bacskó



On Mon, Jan 27, 2014 at 11:02 PM, Rugxulo <[hidden email]> wrote:


> BTW, I like Pascal (although I'm no expert or anything), but I can't
> imagine FPC wouldn't work here too (in lieu of TP7). It supports 386
> natively, so it doesn't need dumb opcode overrides. I can't imagine it
> wouldn't be possible to get working since CWSDPMI has a ring 0 version
> (CWSDPR0). Actually, even beyond that, there is (barely) an 8086 real
> mode target for FPC nowadays (in unofficial 2.7.x, which will become
> 2.8.0 eventually), but it's very rough around the edges. See BTTR's
> Forum for details about an unofficial snapshot (but I think we're more
> likely to get 2.6.4 than 2.8.0 any time soon):
>
> http://www.bttr-software.de/forum/board_entry.php?id=12985#p13014


Hi,
I have tried the real-mode FPC (PP16), but unfortunately it cannot handle 32-bit code in inline assembly properly.
E.g.
 mov eax,dword ptr [reax]
 becomes
 mov ecx,word [bp-4]
 in the intermediate assembly file so you get an
 error: mismatch in operand sizes
 from the compiler. 

So you can compile the source only with the same ‘dumb opcode overrides’, and the source-level elegance advantage disappears.
But you are right, the 32-bit FPC compiler can be used. I have uploaded the modified package with fp32 source code. The package also contains the compiled protected-mode binary (sstepp.exe), which is prelinked with the ring 0 stub CWSDSTR0.


_______________________________________________

Re: Intel Speedstep under DOS

Rugxulo
Hi,

On Tue, Jan 28, 2014 at 5:39 AM, Zoltán Bacskó <[hidden email]> wrote:
>
> First and foremost thanks for the feedback.
> I'm sorry if it was not clear. This is not a release, just an experiment.
> Its draft status can only change if I get some feedback about its usability
> and 'universality' among Speedstep enabled processors.

Of course, I assumed as much. It just wasn't totally obvious how to
use it since there was no readme.

On this laptop, under Windows, the only two (default) power options
are "Balanced" and "Power Saver". The BIOS says minimum clock speed is
1.2 GHz (vs. 2.2 GHz).

> I could only test it with exactly 2 processors.
>
> 1.Intel Atom N455 1.66Ghz - 1 core 2 threads. FreeDos kernel 2041.
> 2. Intel Core 2 Duo E7500 2.93 Ghz 2 Cores. MS-DOS 7.1 (Win98 SE).
>
> It worked in both cases, but only the single core Atom was adjustable freely
> (I mentioned the multicore problem previously). I don't think it is kernel
> dependent.
>
> I used this Intel document:
> http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.html

I still have not yet read this. Just FYI!   :-)    Probably too
technical for me anyways.

> The relevant part is the CHAPTER 14 - POWER AND THERMAL MANAGEMENT.
> As I said before there were no exact descriptions about the necessary coding
> of the MSR's.
>
> My findings:
> (... snip, confusing! ...)
>
> So the usage of the (draft status) program.
>
> 1. First run the program without any argument.

Both 16-bit and 32-bit versions returned the same value here (see
below). I didn't test the 32-bit one any further, just assumed 16-bit
was equivalent and good enough for now.

> 2. Look at the output. The program checks the Speedstep support by checking
> CPUID ECX feature bit 7. If this bit is not set then you get 'CPU not
> supported'. No further things to do :) (You demanded the exact CPU
> /models/steppings that are supported. The essence is I don't know and I need
> your help to determine this.)

I don't know either without aimlessly searching the Internet.

My Dell (f/m/s = 6 / 7 / a) laptop's BIOS (A13?) had an explicit
setting for enabling/disabling Intel SpeedStep (under "Battery",
IIRC), and it was already enabled. I didn't see anything about
"lowering p states". The only Power Management stuff was related to
Wake USB and Wake on LAN.

Okay, this is a Dell Inspiron 1545 laptop calling itself "Pentium(R)
Dual-Core CPU       T4400  @ 2.20GHz_" (sigh, yes, horribly misleading
name, but anyways). Wikipedia calls this "Core based Pentium", aka
Penryn-3M or Penryn-L (45nm). They say it has "Enhanced Intel
SpeedStep Technology (EIST)".

However, as you implied, here it doesn't seem to work under DOS (with
multiple cores).

=====================================
G:\TONY>sstep

Speedstep 1.0 by Falcosoft
usage: sstep [multiplier] [voltageid] -without parameters shows CPU info

LastTriedFid: 11
LastTriedVid: 35
CurrentFid: 11
CurrentVid: 35
MaxFid: 11
MaxVid: 35
MinFid: 6
MinVid: 31
=====================================

Trying "sstep 6 31" didn't work (LastTried was changed but not
Current). So I guess by default it's just always max clock speed
(worse battery life) under non-ACPI OSes (DOS).
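
As a rough cross-check of the output above: if the printed values are decimal and the FID is the multiplier (as reported earlier in the thread), then MaxFid 11 at the rated 2.2 GHz implies a 200 MHz bus clock, and MinFid 6 gives the 1.2 GHz minimum the BIOS shows. Both "ifs" are assumptions, and the names below are made up for the arithmetic.

```python
# Arithmetic cross-check of the sstep output against the BIOS figures.
# Assumes decimal output and FID == multiplier; names are illustrative.

def clock_mhz(fid, bus_mhz):
    """Core clock for a given multiplier and bus clock, in MHz."""
    return fid * bus_mhz

rated_mhz = 2200   # "T4400 @ 2.20GHz"
max_fid = 11       # MaxFid from the output above
min_fid = 6        # MinFid from the output above

bus_mhz = rated_mhz / max_fid
print(bus_mhz)                       # 200.0
print(clock_mhz(min_fid, bus_mhz))   # 1200.0
```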

> 3. If  Speedstep is supported then you get the current, and possible
> minimum, maximum values. So you can run the program with the necessary 2
> arguments: FID(multiplier), VID (processor specific voltage id). You shuld
> try values given in the minimum/maximum range. If p-state transition
> succeeded then you should see the
> LastTriedFid/Vid and CurrentFid/Vid values are equal. If not then likely the
> LastTriedFid/Vid values are the ones you defined, but CurrentFid/Vid are
> unchanged. This is the situation I described before as flawed multi core
> logic of speedstep. In this case you should lower your initial p-state in
> BIOS.

I see no way to lower "p states". It just isn't supported here, by
default anyways.

> Any feedback is still welcome. Mainly from single core CPU users (Pentium M,
> Core Solo, Atom) to verify my theory that the multicore problem really
> affects only multiple core processors.

My main P4 is disconnected and not within reach, and I don't have any
other recent (single core) cpus. I'm not sure even that would support
SpeedStep, that PC is comparatively old by now (2002).

Battery life is quite a mixed bag on this laptop. I don't know what
helps or hurts. Normal light use lasts a few hours, but using Flash or
running Eternity (Win32) + FreeDoom seems to eat up 2x quicker.
(Presumably graphical-intensive stuff can't help it, which I usually
avoid unless really bored.)

I have never really benchmarked in DOS how long the battery lasts,
e.g. with FDAPM (but no APM in BIOS, doh) or without. So I really, for
good or bad, just don't use DOS much on this particular machine. (No
VT-X either. Sure, I could probably run Fedora liveUSB with DOSEMU,
but last time it just mysteriously hosed itself. Nah, I just use my
other PC [main desktop] with native FreeDOS. My RUFUS USB is only for
larks on rare occasion like this, heh.)
