Bug#523716: nvidia-kernel-source: fails to build

Hello everyone,

180.44-2 still does not build against 2.6.28, but it does build fine
against 2.6.29, both debian stock kernels and headers.

The linux-kbuild packages are homemade, since they seem to be missing
from unstable; they were built according to:

http://wiki.debian.org/HowToRebuildAnOfficialDebianKernelPackage

Here is the output of:

# m-a a-i nvidia

for 2.6.28-1-amd64:

/usr/bin/make -C . LINUXDIR=/lib/modules/2.6.28-1-amd64/build
KVERREL=2.6.28-1-amd64 clean
make[1]: Entering directory `/usr/src/modules/nvidia-kernel'
rm -rf *.o *.ko .depend .*.flags .*.d .*.cmd *.mod.c .tmp_versions
make[1]: Leaving directory `/usr/src/modules/nvidia-kernel'
dh_clean
/usr/bin/make -f debian/rules kdist_clean kdist_config binary-modules
make[1]: Entering directory `/usr/src/modules/nvidia-kernel'
/usr/bin/make -C . LINUXDIR=/lib/modules/2.6.28-1-amd64/build
KVERREL=2.6.28-1-amd64 clean
make[2]: Entering directory `/usr/src/modules/nvidia-kernel'
rm -rf *.o *.ko .depend .*.flags .*.d .*.cmd *.mod.c .tmp_versions
make[2]: Leaving directory `/usr/src/modules/nvidia-kernel'
dh_clean
/usr/bin/make -w -f debian/rules configure
make[2]: Entering directory `/usr/src/modules/nvidia-kernel'
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/control.template
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/dirs.template
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/override.template
make[2]: Leaving directory `/usr/src/modules/nvidia-kernel'
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/control.template
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/dirs.template
sed 's/#KVERS#/2.6.28-1-amd64/g' debian/override.template
dh_testroot
dh_prep
# Build the modules
/usr/bin/make -C . LINUXDIR=/lib/modules/2.6.28-1-amd64/build
KVERREL=2.6.28-1-amd64
make[2]: Entering directory `/usr/src/modules/nvidia-kernel'
make -C /lib/modules/2.6.28-1-amd64/build M=`/bin/pwd` modules
make[3]: Entering directory `/usr/src/linux-headers-2.6.28-1-amd64'
CC [M] /usr/src/modules/nvidia-kernel/nv.o
In file included from include/linux/bitops.h:17,
from include/linux/kernel.h:15,
from include/linux/sched.h:52,
from include/linux/utsname.h:35,
from /usr/src/modules/nvidia-kernel/nv-linux.h:19,
from /usr/src/modules/nvidia-kernel/nv.c:14:
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/bitops.h: In
function ‘set_bit’:
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/bitops.h:60:
warning: pointer of type ‘void *’ used in arithmetic
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/bitops.h: In
function ‘clear_bit’:
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/bitops.h:97:
warning: pointer of type ‘void *’ used in arithmetic
In file included from include/linux/utsname.h:35,
from /usr/src/modules/nvidia-kernel/nv-linux.h:19,
from /usr/src/modules/nvidia-kernel/nv.c:14:
include/linux/sched.h: In function ‘object_is_on_stack’:
include/linux/sched.h:2025: warning: pointer of type ‘void *’ used in
arithmetic
In file included from /usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/dma-mapping.h:9,
from include/linux/dma-mapping.h:57,
from include/asm-generic/pci-dma-compat.h:7,
from /usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/pci.h:94,
from include/linux/pci.h:1002,
from /usr/src/modules/nvidia-kernel/nv-linux.h:86,
from /usr/src/modules/nvidia-kernel/nv.c:14:
include/linux/scatterlist.h: In function ‘sg_virt’:
include/linux/scatterlist.h:199: warning: pointer of type ‘void *’ used
in arithmetic
In file included from /usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/hardirq_64.h:5,
from /usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/hardirq.h:4,
from include/linux/hardirq.h:7,
from include/linux/interrupt.h:12,
from /usr/src/modules/nvidia-kernel/nv-linux.h:87,
from /usr/src/modules/nvidia-kernel/nv.c:14:
include/linux/irq.h: In function ‘irq_to_desc’:
include/linux/irq.h:189: warning: comparison between signed and unsigned
In file included from /usr/src/modules/nvidia-kernel/nv-linux.h:113,
from /usr/src/modules/nvidia-kernel/nv.c:14:
include/linux/highmem.h: In function ‘zero_user_segments’:
include/linux/highmem.h:136: warning: pointer of type ‘void *’ used in
arithmetic
include/linux/highmem.h:139: warning: pointer of type ‘void *’ used in
arithmetic
In file included from include/linux/compat.h:14,
from /usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/mtrr.h:141,
from /usr/src/modules/nvidia-kernel/nv-linux.h:142,
from /usr/src/modules/nvidia-kernel/nv.c:14:
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/compat.h: In
function ‘compat_alloc_user_space’:
/usr/src/linux-headers-2.6.28-1-amd64/arch/x86/include/asm/compat.h:210:
warning: pointer of type ‘void *’ used in arithmetic
/usr/src/modules/nvidia-kernel/nv.c: In function ‘nv_kern_cpu_callback’:
/usr/src/modules/nvidia-kernel/nv.c:1265: error: too many arguments to
function ‘smp_call_function’
/usr/src/modules/nvidia-kernel/nv.c:1271: error: too many arguments to
function ‘smp_call_function’
make[4]: *** [/usr/src/modules/nvidia-kernel/nv.o] Error 1
make[3]: *** [_module_/usr/src/modules/nvidia-kernel] Error 2
make[3]: Leaving directory `/usr/src/linux-headers-2.6.28-1-amd64'
make[2]: *** [modules] Error 2
make[2]: Leaving directory `/usr/src/modules/nvidia-kernel'
make[1]: *** [binary-modules] Error 2
make[1]: Leaving directory `/usr/src/modules/nvidia-kernel'
make: *** [kdist_build] Error 2

Hope this helps.

Could you try changing the line in
/usr/src/modules/nvidia-kernel/conftest.h that says:

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,29)

to

#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,27)

And then running module-assistant again, but with the -O option to
prevent it from re-extracting the source?

Maybe I misread the kernel changes and got wrong the version in which
this function changed. Right now I have 2.6.29 listed as the first
kernel to use the 3 argument version of smp_call_function. Perhaps it
changed earlier than that, or perhaps one of the 2.6.28.x versions
changed it too. I did not consider the point releases of stable kernels
when doing the checks.
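To make the version test concrete, here is a minimal userspace sketch of
how the comparison behaves. DEMO_KERNEL_VERSION copies the packing that
the kernel's KERNEL_VERSION() macro used in the 2.6 series, as far as I
know; the demo_* names are made up for illustration and are not part of
conftest.h. The packing has no slot for a fourth number, so a 2.6.28.x
point release would compare the same as plain 2.6.28.

```c
#include <stdio.h>

/* Assumed to match the 2.6-era KERNEL_VERSION() packing: version in the
 * high bits, then one byte each for patchlevel and sublevel. */
#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper: does a ">= threshold" test pick the 3 argument
 * call for headers that report version_code? */
int demo_selects_three_args(int version_code, int threshold)
{
    return version_code >= threshold;
}

/* Hypothetical helper: show what each threshold decides for 2.6.28. */
void demo_report(void)
{
    int code = DEMO_KERNEL_VERSION(2, 6, 28);

    printf("threshold 2.6.29 picks 3 args: %d\n",
           demo_selects_three_args(code, DEMO_KERNEL_VERSION(2, 6, 29)));
    printf("threshold 2.6.27 picks 3 args: %d\n",
           demo_selects_three_args(code, DEMO_KERNEL_VERSION(2, 6, 27)));
}
```

With 2.6.28 headers the 2.6.29 threshold keeps the 4 argument call, while
the 2.6.27 threshold switches to the 3 argument call. That is the only
thing the suggested one line change does.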

Now I just checked 2.6.28 again, and certainly a pure 2.6.28 expects
4 arguments to the function smp_call_function, which is what the code
currently does. Only 2.6.29, or a kernel with other patches could have
the 3 argument version of smp_call_function.

When you look at your kernel, what do you have in include/linux/smp.h
for the smp_call_function() definition?

2.6.29 has:
int smp_call_function(void(*func)(void *info), void *info, int wait);
So that is 3 arguments.

2.6.26 has:
int smp_call_function(void(*func)(void *info), void *info, int retry, int wait);
So that is 4 arguments.

2.6.28 from kernel.org has the same as 2.6.26, so it should be working.
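For illustration, the kind of version-keyed wrapper this implies could be
sketched as below. This is a userspace mock-up, not the actual nvidia or
conftest.h code: DEMO_VERSION_CODE stands in for LINUX_VERSION_CODE,
demo_smp_call_function is a stub with the same argument shapes as the two
prototypes quoted above, and passing 0 for the old retry argument is an
assumption on my part.

```c
#include <stdio.h>

#define DEMO_KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Assumption: stand-in for LINUX_VERSION_CODE from the kernel headers. */
#ifndef DEMO_VERSION_CODE
#define DEMO_VERSION_CODE DEMO_KERNEL_VERSION(2, 6, 29)
#endif

#if DEMO_VERSION_CODE >= DEMO_KERNEL_VERSION(2, 6, 29)
/* Stub with the 3 argument shape quoted for 2.6.29. */
int demo_smp_call_function(void (*func)(void *info), void *info, int wait)
{
    (void)wait;
    func(info);   /* the real function runs func on the other CPUs */
    return 0;
}
#define DEMO_SMP_CALL(func, info, wait) \
    demo_smp_call_function((func), (info), (wait))
#else
/* Stub with the 4 argument shape quoted for 2.6.26. */
int demo_smp_call_function(void (*func)(void *info), void *info,
                           int retry, int wait)
{
    (void)retry;
    (void)wait;
    func(info);
    return 0;
}
/* Assumption: 0 is an acceptable value for the old retry argument. */
#define DEMO_SMP_CALL(func, info, wait) \
    demo_smp_call_function((func), (info), 0, (wait))
#endif

/* Hypothetical callback: increments the int that info points to. */
void demo_bump(void *info)
{
    ++*(int *)info;
}
```

Callers then write DEMO_SMP_CALL(demo_bump, &counter, 1) in both cases.
The wrapper only picks the right shape if the version number really
tells you which prototype the headers declare.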

Now I found some 2.6.27 and 2.6.28 debian headers on one of my machines,
and it appears debian patched something in 2.6.27 that gave it the
smp_call_function style of 2.6.29, even though official 2.6.27 and 2.6.28
don't have that. How amazingly annoying. So debian's kernels appear
to at times be incompatible with the kernel.org kernel. At the same
time they have now taken away our ability to use compile tests to look
for which function call style to use, and we can't trust the kernel
version for testing either, since patches may completely change the
kernel interface. I can only take this to mean that there is now no way
to reliably make out of tree modules compile against debian's kernels
given the changes done in 2.6.29.

So if you use debian's 2.6.27 or 2.6.28, then it is broken, but then
again those two no longer exist in debian, so you probably should stop
using those. If you use 2.6.27 or 2.6.28 from kernel.org then they should
work fine (as long as you avoid any of debian's patches to the kernel).

Dear Len,

I will admit I have not understood half of what you wrote. However,
modifying that line as you suggest and then launching:

$ sudo m-a a-i -O -l 2.6.28-1-amd64 nvidia

built the package correctly.
I have not rebooted with 2.6.28 to test it, but I expect it to work.

I was holding back at 2.6.28 because of the nvidia module; now I can
safely switch full time to 2.6.29. Still, I believe it is good to know
it's fixed.

Hope this helps, let me know if you need more testing.