Thursday, 2019-08-15

*** linuxjacques <linuxjacques!~jacques@nslu2-linux/jacques> has quit IRC00:07
*** linuxjacques <linuxjacques!~jacques@nslu2-linux/jacques> has joined #yocto00:08
*** moto-timo <moto-timo!~ttorling@fsf/member/moto-timo> has quit IRC00:46
*** cp <cp!> has quit IRC01:22
*** cp <cp!> has joined #yocto01:24
*** chinhuat6 <chinhuat6!~chinhuat@> has joined #yocto02:04
*** chinhuat <chinhuat!~chinhuat@> has quit IRC02:05
*** georgem_home <georgem_home!uid210681@gateway/web/> has quit IRC03:06
*** learningc <learningc!> has joined #yocto03:19
*** behanw <behanw!uid110099@gateway/web/> has joined #yocto03:40
*** learningc <learningc!> has quit IRC04:10
*** learningc <learningc!> has joined #yocto04:11
*** Dvorkin <Dvorkin!~Dvorkin@> has quit IRC04:26
*** Bunio_FH <Bunio_FH!> has quit IRC04:38
*** AndersD <AndersD!> has joined #yocto05:27
*** asabil <asabil!> has joined #yocto06:10
*** jeanba <jeanba!~jbl@> has joined #yocto06:16
*** jeanba <jeanba!~jbl@> has left #yocto06:16
*** TobSnyder <TobSnyder!> has joined #yocto06:19
*** chinhuat69 <chinhuat69!~chinhuat@> has joined #yocto06:25
*** chinhuat69 is now known as chinhuat06:26
*** chinhuat6 <chinhuat6!~chinhuat@> has quit IRC06:27
*** agust <agust!> has joined #yocto06:31
*** asabil <asabil!> has quit IRC06:54
*** asabil <asabil!> has joined #yocto06:55
*** jmiehe <jmiehe!> has joined #yocto06:59
*** T_UNIX <T_UNIX!uid218288@gateway/web/> has joined #yocto07:26
*** Bunio_FH <Bunio_FH!> has joined #yocto07:46
*** learningc <learningc!> has quit IRC08:01
*** learningc <learningc!> has joined #yocto08:03
*** florian <florian!~florian_k@Maemo/community/contributor/florian> has joined #yocto08:13
*** qt-x <qt-x!50614037@> has joined #yocto08:21
qt-xCan a recipe be made to build multiple images? e.g. bitbake multi-image08:24
qt-xwhere multi-image instructs it to build 3 or more images08:26
RPqt-x: yes, just have them as dependencies do_sometask[depends] = "image1:do_image_complete image2:do_image_complete image3:do_image_complete"08:28
SaurRP: Regarding that ResourceWarning I mentioned yesterday, I am building without the hash server enabled.08:30
qt-xawesome thanks RP08:30
*** jeanba <jeanba!~jbl@> has joined #yocto08:37
*** jeanba <jeanba!~jbl@> has left #yocto08:37
*** florian <florian!~florian_k@Maemo/community/contributor/florian> has quit IRC08:38
*** gaulishcoin <gaulishcoin!> has joined #yocto08:41
*** alimon <alimon!alimon@gateway/shell/linaro/x-muzwthdlgbmewyai> has quit IRC08:41
*** jofr <jofr!~jofr@> has quit IRC08:47
*** jofr <jofr!~jofr@> has joined #yocto08:48
asabilHi everyone08:49
asabilI was wondering why, when an initramfs is built and bundled with bitbake, the kernel with the bundled initramfs is not actually packaged08:50
asabiland is only left lying around in the build tree08:50
yoctiNew news from stackoverflow: /dev/fd/ socket or pipe links fail, NOT missing /dev/fd link <>08:56
*** BCMM <BCMM!~BCMM@unaffiliated/bcmm> has joined #yocto08:56
*** florian <florian!~florian_k@Maemo/community/contributor/florian> has joined #yocto09:07
*** gaulishcoin <gaulishcoin!> has quit IRC09:07
asabilIn my case Image.gz.initramfs is created but only Image.gz is packaged09:11
asabilThis is inside kernel.bbclass09:12
*** asabil <asabil!> has quit IRC09:18
*** asabil <asabil!> has joined #yocto09:19
*** edgar444 <edgar444!uid214381@gateway/web/> has joined #yocto09:28
*** learningc <learningc!> has quit IRC09:45
*** opennandra <opennandra!> has joined #yocto09:45
*** opennandra <opennandra!> has quit IRC09:56
*** learningc <learningc!~learningc@> has joined #yocto10:06
*** bluelightning_ <bluelightning_!~paul@pdpc/supporter/professional/bluelightning> has joined #yocto10:10
*** bluelightning <bluelightning!~paul@pdpc/supporter/professional/bluelightning> has quit IRC10:15
*** bluelightning_ <bluelightning_!~paul@pdpc/supporter/professional/bluelightning> has quit IRC10:31
*** dmoseley_ <dmoseley_!> has quit IRC10:34
*** bluelightning_ <bluelightning_!~paul@pdpc/supporter/professional/bluelightning> has joined #yocto10:35
*** vmeson <vmeson!> has quit IRC10:56
*** goliath <goliath!> has joined #yocto10:58
*** asabil <asabil!> has quit IRC11:02
*** asabil <asabil!> has joined #yocto11:03
*** pung_ <pung_!~BobPungar@> has joined #yocto11:16
*** BobPungartnik <BobPungartnik!~BobPungar@> has quit IRC11:21
*** blueness <blueness!~blueness@gentoo/developer/blueness> has quit IRC11:22
*** kroon <kroon!~kroon@> has joined #yocto11:24
Crofton|workscary thing to see in the morning!11:33
*** berton <berton!~berton@> has joined #yocto11:43
*** Chrusel <Chrusel!c1669b04@> has joined #yocto12:23
asabilDoes anyone have any input regarding the question I asked earlier?12:28
RPasabil: initramfs depends on a lot of different things, some images include them, some don't. It could also vary depending on the target architecture, kernel and machine. Basically it depends on a lot of different things12:29
asabilRP: yes I know, my problem is with the kernel.bbclass12:30
asabilit generates <Image>.initramfs and then forgets about it12:30
asabilit feels like a bug to me, but I am not sure12:30
*** Dvorkin <Dvorkin!~Dvorkin@> has joined #yocto12:31
DvorkinHow to overwrite default Kconfig value using meta?12:32
*** opennandra <opennandra!> has joined #yocto12:32
opennandraI have vendor SDK which have heavy patched u-boot + kernel + rootfs12:32
asabiland then this is where it specifies the Package files
opennandraI plan to use u-boot and kernel and move building rest to yocto12:33
opennandraproject using 4.9. linaro toolchain12:33
opennandramy target is poky rocko12:33
asabilnothing ever refers to $imageType.initramfs12:33
opennandrabut I'm having some issues like: ERROR: gmp-6.1.2-r0 do_package_qa: QA Issue: libgmpxx rdepends on external-linaro-toolchain-dbg [debug-deps]12:34
opennandraERROR: gmp-6.1.2-r0 do_package_qa: QA Issue: /usr/lib/ contained in package libgmpxx requires, but no providers found in RDEPENDS_libgmpxx? [file-rdeps]12:34
opennandrais it even a good idea to try it like that?12:34
*** georgem_home <georgem_home!uid210681@gateway/web/> has joined #yocto12:36
RPopennandra: the shlibs automatic dependency code is confused as it can't figure out what should be providing libstdc++12:38
RPopennandra: a question for the supplier of this external toolchain12:38
opennandraRP: looks like it's QA issue so can I just suppress it and try?12:39
opennandradoes this thing even make sense?12:39
*** georgem <georgem!~georgem@> has quit IRC12:46
jwesselRP:  I have some more information about the glibc-locale pseudo issue, but I still don't really understand the nature of the problem yet.12:46
*** asabil <asabil!> has quit IRC12:47
*** georgem <georgem!~georgem@> has joined #yocto12:47
jwesselIt took 11 hours to get an instrumented pseudo to reproduce the problem.  It just failed a bit ago.12:47
*** asabil <asabil!> has joined #yocto12:47
jwesselI don't know how much you looked at it, but there is a filter log with the important bits stripped out.12:47
jwesselI am trying to understand the sequence of events that lead up to pseudo to decide the uid is wrong.12:49
*** qt-x <qt-x!50614037@> has quit IRC12:55
*** kaspter <kaspter!~Instantbi@> has quit IRC12:59
RPopennandra: Its a warning that something is seriously wrong with your build. Sure you can turn it off but that doesn't fix it.13:12
RPjwessel: interesting. Let me see if I can page in from swap :)13:13
opennandraRP: ok, what is the suggested way then? Use some older yocto release?13:13
opennandraRP: I need to build stuff with external toolchain13:13
*** JPEW <JPEW!cc4da337@> has joined #yocto13:15
*** asabil <asabil!> has quit IRC13:16
RPopennandra: Fix the external toolchain, or ask the supplier of the external toolchain why it's broken. I know nothing about it so I can't really help. I know what that error means but I don't know what the correct fix is or anything about the toolchain. I do know that turning off the check will just make it fail later13:17
RPjwessel: the other interesting thing is "mode 100600" - where did that come from...13:18
opennandraRP: ok thanks a lot, I'll ask on mailing list then13:18
RPjwessel: I've always thought that there was stale data in pseudo's db that by chance happens to corrupt a new file13:19
RPjwessel: If I was right about that, the question is where did the bad info come from (stale inode?)13:19
RPjwessel: how complete are your logs? can you tell if that inode has any previous history with an unrelated file?13:20
SaurRP: Is there some way to tell bitbake to copy files from SSTATE_MIRRORS rather than creating symbolic links to them? In our case we have the global sstate cache on an NFS mount and I would prefer to copy the files to the local sstate cache rather than having to retrieve them via NFS each time.13:22
*** vmeson <vmeson!> has joined #yocto13:24
*** nabokov <nabokov!~armand@> has joined #yocto13:25
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto13:28
*** bluelightning_ <bluelightning_!~paul@pdpc/supporter/professional/bluelightning> has quit IRC13:30
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC13:30
RPSaur: you'd have to tweak the fetcher code afaik13:31
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto13:31
SaurRP: Ok. Would it be acceptable to add a way to force copying instead of linking? Either globally, or perhaps some way to do it per URL in SSTATE_MIRRORS?13:33
RPSaur: My reservation is just about more codepaths combinations in the fetcher code :(13:33
RPper url in sstate_mirrors sounds like some kind of nightmare13:34
SaurProbably not too easy to add either. Globally should probably be a lot easier.13:34
SaurOk, I'll have a look at the code and see what it would involve.13:35
RPSaur: The trouble is each time you add a binary "yes/no" decision for a feature like that into the fetcher, it doubles our test matrix13:36
RPGiven the people we have maintaining it (or not), I'm rather averse to such controls13:37
SaurRP: Yeah, I know. At the same time having the symbolic links to an NFS mount is less than optimal. Especially when there is network failure and the NFS mount is gone for a day due to IT not being able to get the network working (yes, we had that the other day) :P13:38
JPEWSaur: You might be able to expose it over HTTP also?13:39
SaurJPEW: Yeah, that is an alternative, but then it becomes a matter of authentication... With the NFS mount that is taken care of by who can mount it...13:40
JPEWSaur: Ah, ya that gets a little tricky13:40
*** kaspter <kaspter!~Instantbi@> has joined #yocto13:40
RPJPEW: btw, I found another data corruption bug in runqueue. Hoping this explains a few weird things!13:42
JPEWRP: Cool. Did you get any stats from the AB?13:42
RPSaur: Its a tricky one. Once I accept such a patch we're stuck trying to maintain that API effectively indefinitely though :/13:42
RPJPEW: {"connections": {"total_time": 2058.1817467394285, "max_time": 0.2676054770126939, "num": 1772291, "average": 0.0011613114024386676, "stdev": 0.003402929594519231}, "requests": {"total_time": 1224.0615269057453, "max_time": 0.26773543702438474, "num": 1772290, "average": 0.0006906666103773904, "stdev": 0.0005487492249695723}}13:43
RPJPEW: that was after a single build approximately completed13:43
JPEWRP: Any connection timeouts?13:43
RPJPEW: loads13:43
RPJPEW: - any warning is a timeout13:44
JPEWHmm. Ok, that's unfortunate. It means my stats probably aren't capturing where the timeout happens :(13:44
SaurRP: Yet without the possibility of adding those kinds of tweaks, my hands are tied, as I have no way of doing local modifications to bitbake, compared to classes and recipes that I can copy/append to locally.13:44
RPJPEW: I think it just means the server can't handle enough requests to stop some connections stalling13:44
JPEWi.e. the Kernel can't accept anymore connections? Ya that seems likely13:45
JPEW1772290 * (0.0011 + 0.0007) = 3190 seconds13:46
RPSaur: you can monkey patch bitbake. You'd just not *like* to do that13:46
SaurNo, I definitely don't like the idea of doing that...13:47
JPEWRP: So, I think we can say that connections are being serviced in a reasonably timely fashion once the kernel actually allows them13:48
jwesselRP: That was included in the log I posted.13:48
jwesselThat particular inode was used earlier, but it was still good at the time.13:49
jwesselI am not exactly sure if it is DB corruption, or some kind of a odd race condition.13:51
jwesselI suspect I'll have to add additional logging information but I am not sure what to add yet.13:51
jwesselThis is the first time we have caught it "red handed", so to speak, at the moment the bad entry is first entered into the DB.13:52
jwesselI'd like to be able to create a standalone test that emits the same kind of log entries.13:54
jwesselWhat I don't know is if the client 1/2 of the operation is where things went bad.  This is only the server side.  I wasn't sure if the client picks up the UID info and just passes it along, or if the server is making some kind of decision.  To me it looked like a brand new entry.13:55
RPjwessel: this is the challenge with debugging this, its very hard to tell13:57
jwesselI'll have to go read some more code and such.  I have only been working on this intermittently, so I thought I might post what I had.13:57
RPJPEW: I guess we need more threads in parallel answering the connections?13:58
jwesselWhat we know now definitively, is that it is a re-used inode and it has something to do with the hard links and mv operations.13:58
RPjwessel: its useful, I'm just also unfortunately in the middle of a complex mess with runqueue :(13:58
RPjwessel: yes13:59
jwesselI am in the middle of 2 or 3 other things myself :-)13:59
JPEWYa, I think we should make the server use the socketserver.ThreadingMixIn class to thread the server, then make the siggen code use a persistent connection13:59
RPJPEW: doesn't that code use a thread per connection?13:59
RPJPEW: I'm not really willing to go that far, bit risky14:00
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC14:00
JPEWIf you want to share a persistent connection, it makes more sense because then it won't run out of threads to handle new connections (or you need a thread pool that is bounded by the maximum number of clients you expect at any given time)14:01
*** TobSnyder <TobSnyder!> has quit IRC14:01
RPJPEW: I suspect the current design can handle persistent connections, it's the sheer number which is overloading it14:02
RPJPEW: I suspect a thread pool may be easier than persisting though :/14:02
JPEWRP: Not with one thread.... the single thread will handle only one connection until it closes, so it would block all others14:03
*** opennandra <opennandra!> has quit IRC14:04
*** tijko <tijko!~tijko@unaffiliated/tijko> has joined #yocto14:04
JPEWRP: There are easy ways to make the connection persistent, just not with stock python modules.14:04
RPJPEW: hmm, I thought the thread worked differently to that :/14:05
JPEWRP: let me look, I might be confusing something14:06
JPEWRp: OK, I was right. The HTTP handler base class processes request on the connection until it closes:
JPEWYou could, I suppose, pass *those* all off to yet another thread, but that seems messy14:09
RPJPEW: I think we're seeing it differently in that we're thinking about different threads14:09
RPJPEW: I'm looking at it from the perspective of the sockets being opened by the server. There is a thread dedicated to doing that and queuing them up which isn't blocked on closing14:10
*** FailDev <FailDev!18d83107@> has joined #yocto14:10
*** kaspter <kaspter!~Instantbi@> has quit IRC14:12
*** kaspter <kaspter!~Instantbi@> has joined #yocto14:13
JPEWRP: Correct... I was running ahead and trying to think about persistent connections and threads. I don't really see how adding more threads would help with non-persistent connections?14:14
*** armpit <armpit!~armpit@2601:202:4180:c33:bf:ea8b:b284:1e7e> has quit IRC14:14
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto14:14
RPJPEW: it depends where our bottleneck is and I guess on that we're still perhaps not quite in agreement14:15
JPEWRP: Ok, right. I think (assuming my stat code is correct) that the server handles requests timely once it actually accepts() them. The one metric we don't have is the amount of time a connection is pending in the kernel before userspace calls accept() to get it.14:19
JPEWSo either 1) The connections are waiting for a long period of time in the listen queue before userspace calls accept() on them14:20
RPJPEW: think about this maths, We have 40 different autobuilder targets each with 9000 tasks starting in parallel14:20
RPJPEW: that means 360,000 requests approximately in parallel which we need to answer in less than two minutes. Can the server do that?14:21
RPJPEW: with the average connection time it would take 418s, the average requests time, 248s, both of which are > 120s14:22
RPJPEW: so I suspect even the request time is too slow :/14:23
RP(I'm open to persuasion I'm missing something)14:23
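RP's back-of-the-envelope maths above can be checked with a short Python sketch. The averages are the ones from the stats dump RP pasted; the worker estimate at the end assumes ideal linear scaling across handler threads, which is an assumption of this sketch and not something measured in the discussion.

```python
import math

# Figures quoted in the discussion above.
TARGETS = 40                      # autobuilder targets
TASKS_PER_TARGET = 9000           # tasks starting in parallel per target
AVG_CONNECTION_S = 0.0011613114024386676
AVG_REQUEST_S = 0.0006906666103773904
BUDGET_S = 120                    # "less than two minutes"


def serial_time(n_requests, avg_seconds):
    """Time for one thread to work through n_requests back to back."""
    return n_requests * avg_seconds


def minimal_workers(total_seconds, budget_seconds):
    """Idealised lower bound on workers, assuming perfect scaling."""
    return math.ceil(total_seconds / budget_seconds)


n_requests = TARGETS * TASKS_PER_TARGET
connection_total = serial_time(n_requests, AVG_CONNECTION_S)
request_total = serial_time(n_requests, AVG_REQUEST_S)

if __name__ == "__main__":
    print(f"{n_requests} requests")
    print(f"connection time: {connection_total:.0f}s of a {BUDGET_S}s budget")
    print(f"request time:    {request_total:.0f}s of a {BUDGET_S}s budget")
    print("idealised workers needed:",
          minimal_workers(connection_total, BUDGET_S))
```

Both totals exceed the 120 s budget, which matches RP's conclusion that a single handler thread cannot keep up. The idealised bound is a floor only; real contention would push the useful pool size higher.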
*** Chrusel <Chrusel!c1669b04@> has quit IRC14:24
JPEWRP: You are correct. I suspect adding one more thread to the pool would cut the request time almost in half... the request time includes the I/O time required to read the data from TCP socket as well as write the results back.14:24
*** tijko <tijko!~tijko@unaffiliated/tijko> has quit IRC14:25
*** armpit <armpit!~armpit@2601:202:4180:c33:7c98:5faa:262d:a3af> has joined #yocto14:25
RPJPEW: right, we probably ideally want a pool of around 5-1014:28
*** tijko <tijko!~tijko@unaffiliated/tijko> has joined #yocto14:28
*** dreyna <dreyna!> has joined #yocto14:28
JPEWRP: I think doing that will significantly reduce the request time14:28
JPEWRP: But the connect time on its own is still too long.... I suppose it's possible the reduction in request time would also reduce the connection time14:30
JPEWRP: Which actually seems pretty likely. The connect time can't *possibly* be shorter (on average) than the request time if the server is running full tilt with a single request handling thread14:31
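For readers following the threading discussion, here is a minimal sketch of an HTTP server that handles each connection in its own thread using Python's standard `socketserver.ThreadingMixIn`. It is not the bitbake hash server: `HashHandler`, `ThreadedHTTPServer`, `fetch_once` and the JSON body are hypothetical names and values made up for illustration. Note that `ThreadingMixIn` spawns one thread per connection; it is not the bounded pool of 5-10 threads RP mentions.

```python
import http.server
import socketserver
import threading
import urllib.request


class HashHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical handler that returns a fixed JSON body."""

    def do_GET(self):
        body = b'{"ok": true}'
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the example quiet; the default logs every request to stderr.
        pass


class ThreadedHTTPServer(socketserver.ThreadingMixIn,
                         http.server.HTTPServer):
    # Each accepted connection is handled in a new thread, so one slow
    # client no longer blocks accept() for everyone else.
    daemon_threads = True


def fetch_once():
    """Start the server on a free port, issue one GET, then shut down."""
    server = ThreadedHTTPServer(("127.0.0.1", 0), HashHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
            return resp.status, resp.read()
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```

The mix-in must come before `HTTPServer` in the base class list so that its `process_request` override takes effect.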
*** JaMa <JaMa!> has joined #yocto14:37
JaMaeither my builds are much bigger or bitbake on one of most powerful servers I've access to is still slower than what people use, will send the parsing times to ML "shortly" last test running on No currently running tasks (28875 of 71749)14:40
*** FailDev <FailDev!18d83107@> has quit IRC14:45
*** Bunio_FH <Bunio_FH!> has quit IRC14:54
*** kaspter <kaspter!~Instantbi@> has quit IRC14:57
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC15:00
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has joined #yocto15:00
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto15:02
*** FailDev <FailDev!18d83107@> has joined #yocto15:03
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC15:06
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto15:08
*** kaspter <kaspter!~Instantbi@> has joined #yocto15:16
*** edgar444 <edgar444!uid214381@gateway/web/> has quit IRC15:17
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has quit IRC15:18
*** kroon <kroon!~kroon@> has quit IRC15:23
RPJaMa: or you're hitting the same inotify issue that kanavin's profile seemed to show...15:35
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC15:52
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto15:59
*** goliath <goliath!> has quit IRC16:02
JaMaRP: yes, we'll see -P is running now16:05
JaMaRP: those "ResourceWarning: unclosed" warnings are about the socket of PRServer on localhost, now I see in some builds that it even started PRServer twice16:08
JaMaNOTE: Started PRServer with DBfile: prserv.sqlite3, IP:, PORT: 44707, PID: 394716:08
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC16:08
JaMaNOTE: Terminating PRServer...16:08
JaMaNOTE: Started PRServer with prserv.sqlite3, IP:, PORT: 42189, PID: 394916:08
JaMabb/ ResourceWarning: unclosed <socket.socket fd=10, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('', 44707)> "while_clause": lambda x: (chain(x.condition, x.cmds), None),16:08
JaMaand it's triggered from various places (not just
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto16:09
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC16:11
*** jmiehe <jmiehe!> has quit IRC16:12
JaMaRP: if I return at "Executing tasks", then it took only 6mins and there is only one notification in the profile (21 config_notification)16:16
RPJaMa: so its the iterating through the tasks part that is slow for you?16:18
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has joined #yocto16:21
RPJaMa: I think kanavin's numbers were Ctrl+C before executing tasks16:22
JaMaRP: looks like it, will let it run whole build with -n (with latest master-next it was "only" 90 mins)16:22
RPJaMa: that is *way* too long16:22
RPJaMa: did you have the -P output for that?16:23
JaMaor should I move the return to some better place? I'm looking at the execute() function but don't see where it would make most sense16:23
JaMano, I have -P output only for this short 6min part16:24
JaMa90mins is *way* too long, but with older master-next it was over 10 hours, so it's nice improvement :)16:27
RPJaMa: Right, its better. What was it before we started messing with runqueue?16:27
JaMabefore messing with runqueue (bitbake 1f630fdf0260db08541d3ca9f25f852931c19905) it is over 4 hours16:28
RPJaMa: so we did actually get better, its still just slow16:29
JaMaI can try even older revision, but on small sample (core-image-minimal) this revision was the fast baseline16:29
*** goliath <goliath!> has joined #yocto16:29
JaMaor something is messing with my benchmark like those PRserver connections16:30
JaMawill try to disable PRserv as well16:30
RPJaMa: I tried a "bitbake -n world" for poky so effectively oe-core and it takes 2m5016:35
*** vineela <vineela!vtummala@nat/intel/x-hjjhjvtldyngfbkt> has joined #yocto16:35
dkcHey, i'm trying to implement something similar to that:
RPJaMa: appears to be around 200 tasks/second (12044 in total)16:35
dkcbut the variable with the layer revision seems to never be updated16:35
JaMafor me it was about 10 tasks/second, but most of the time was spent between "Executing tasks" message and the next line "Executing task (1 from 71749)"16:38
JaMaRP: I should also note that I had BB_NUMBER_THREADS = "8" on this 72threads machine16:40
JaMamost of the time it looks like this:16:42
JaMa  PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command16:43
JaMa28719 mjansa     20   0 1328M 1265M  7096 R 100.  0.3  4:03.05 ├─ python3 /bitbake/bin/bitbake world -P -n16:43
JaMa28888 mjansa     20   0  496M  370M  8856 S  0.0  0.1  0:01.91 │  └─ python3 /bitbake/bin/bitbake-worker decafbadbad16:43
JaMa28889 mjansa     20   0  496M  370M  8856 S  0.0  0.1  0:00.01 │     └─ python3 /bitbake/bin/bitbake-worker decafbadbad16:43
JaMa28712 mjansa     20   0  164M 29380  9532 S 40.7  0.0  0:10.01 │  │  └─ python3 /bitbake/bin/bitbake world -P -n16:43
JaMa28718 mjansa     20   0  164M 29380  9532 R 17.7  0.0  0:04.14 │  │     └─ python3 /bitbake/bin/bitbake world -P -n16:43
JaMaand only the first one being busy after "NOTE: Executing Tasks" is shown16:44
RPJaMa: that makes sense, the workers get spawned but just return so only the cooker would be loaded16:45
*** blueness <blueness!~blueness@gentoo/developer/blueness> has joined #yocto16:46
JaMaarent workers spawned only when it also shows something like "Executing task (1 from 71749)" ?16:47
RPJaMa: yes16:48
JaMaI'm not that far, it will take 1-3 hours until I get to that point16:48
JaMaInitialising tasks: 100% |###############################################################################################################################################################################################################################| Time: 0:03:2016:48
JaMaSstate summary: Wanted 25260 Found 0 Missed 25260 Current 0 (0% match, 0% complete)16:48
JaMaNOTE: Executing Tasks16:48
RPJaMa: Setting to 8 threads gave 3m116:50
JaMainteresting, lets see what the profile data will show, but will probably send it tomorrow16:51
JaMathis build has also multilib enabled, so it almost doubles the number of available tasks, I should have used something a bit smaller16:52
*** georgem_home <georgem_home!uid210681@gateway/web/> has quit IRC16:56
*** nagio <nagio!> has joined #yocto16:59
*** User_ <User_!~learningc@> has joined #yocto17:00
*** learningc <learningc!~learningc@> has quit IRC17:04
RPJaMa: I'll have to run some more experiments, a build with meta-oe in here is definitely slower :/17:13
RPJaMa: 33000 tasks, 22minutes17:15
*** goliath <goliath!> has quit IRC17:19
JaMaRP: this is with latest bitbake?17:26
RPJaMa: yes17:27
RPJaMa: putting it under profiling now but I really need to look at the correctness problems of the last build :/17:28
RPLooks like we've got some kind of nasty metadata race somewhere :/17:29
JaMawill bisect where bitbake started terminating PRserv, it's not one of 2 commits I was suspecting (the cleanup for termination)17:31
JaMalooks like this one: 05888700 cooker: Improve hash server startup code to avoid exit tracebacks17:31
JaMathis issue is reproducible with 'PRSERV_HOST = "localhost:0"' in local.conf17:33
JaMabut probably not related to the delay I'm seeing, because for last run with -P I've disabled PRserv and rm_work as well and it's still sitting between "Executing tasks" and executing them17:34
JaMaRP: if I remove self.handlePRServ() added to reset() it works again17:39
RPJaMa: right, that would account for the socket issue but probably not the delays17:40
*** T_UNIX <T_UNIX!uid218288@gateway/web/> has quit IRC17:52
*** Bunio_FH <Bunio_FH!> has joined #yocto17:52
JPEWRP: Added more stats logging and multiple threads to the hash server:
dkcdo you have tips on how I could include the git revision of our custom layer in the yocto image? I have a hacky solution with a "nostamp" task, but as a consequence it forces the generation of the rootfs even if nothing changed, I'd like to avoid that18:07
JaMadkc: check ./meta/classes/image-buildinfo.bbclass18:15
dkcJaMa: looks like exactly like what I need, thanks18:20
*** dreyna <dreyna!> has quit IRC18:21
JaMais it expected that core-image-minimal depends on things like libx11 now? It seems to be caused by systemd -> dbus -> libx11: "dbus.do_package" -> "libx11.do_packagedata"18:31
*** Bunio_FH <Bunio_FH!> has joined #yocto18:31
*** Bunio_FH <Bunio_FH!> has quit IRC18:34
*** goliath <goliath!> has joined #yocto18:34
mischiefit seems if i use fitimage and also INITRAMFS_IMAGE_BUNDLE my kernel has the initramfs twice. is there a way to prevent that?18:39
jwesselRP: I got another hit with more logs and things line up to a lack of understanding how this could happen.18:40
jwesselI am not so sure that pseudo is the problem either.18:40
jwesselIf we assume that pseudo is not doing the inode assignment.18:41
jwesselIt is asked to instantiate a hard link, but there is no "origination" location.18:41
jwessel(exact 1 d /L/2.28-r0/locale-tree/usr/lib64/locale/ar_JO.ISO-8859-6/LC_NUMERIC) (exact 1 duf /L/2.28-r0/locale-tree/usr/lib64/locale/ar_JO.ISO-8859-6/LC_NUMERIC)18:42
jwessel(new?) new link: path /L/2.28-r0/locale-tree/usr/lib64/locale/ar_DZ.ISO-8859-6/LC_NUMERIC.tmp, ino 190901401, mode 010064418:42
jwesselNo origin inode data: 190901401 [ no path ]18:42
jwesselI went and poked the actual dir structure, and sure enough.  That is the only instance of that inode.  It should have been linked off to other locations with the LC_NUMERIC, since they are all the same file.18:43
jwesselI don't really understand how that can happen though.18:43
RPjwessel: How can you hardlink to something which doesn't have an inode you're linking to?18:44
jwesselI have no idea.  That is why I put a new log line in there, to prove it was hitting this condition.18:44
jwesselI need to trace back to the requestor, since down at the pseudo DB level / server level that information is long gone.18:45
jwesselSome how this file was turned into a copy instead of a link.18:45
RPjwessel: It at least gives a clue on what we're looking for...18:45
jwesselAll the failures (I have 3 so far), have the exact same signature.18:46
jwesselI thought about putting in clean up code in pseudo to "fix it up" but it clearly isn't the right way to deal with this.18:47
jwesselWe have a garbage in, garbage out situation.18:47
jwesselI just don't understand where the garbage came from.  Clearly it is an error which should probably be fatal if you are asked to hard link something for which there is no reference.18:48
RPjwessel: right, I think we need to understand it more...18:48
jwesselI don't get how it becomes a copy though.18:48
RPthat is odd...18:49
jwesselJust thought I'd provide an update or if you or anyone else has insights, I am open to any input.18:49
RPjwessel: I appreciate the update but I'm not close enough to the code to have useful input :(18:50
RPjwessel: I do agree its odd though as how would a hardlink become a copy. Maybe a libc call fails and this is the fallback?18:50
jwesselI was chatting with marka earlier.  He mentioned there is glibc.bbclass or something which deals with some of this.18:51
*** fray <fray!> has joined #yocto18:51
jwesselIt is specific to how the locales are copied.18:51
jwesselI'll look there next.18:51
jwesselI am not sure how to track down the caller yet, but we do know that by the time pseudo (the server side) is asked to process the hard link, it is already trashed.18:52
RPjwessel: yes, if its being copied in our code then it will try a hardlink, if it fails it will resort to a copy18:52
jwesselIt is still possible the pseudo client end is broken in some way.18:52
RPjwessel: we have a the copy fall back for cases spanning different disks or similar18:53
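The "try a hardlink, fall back to a copy" behaviour RP describes can be sketched in Python as below. This is an illustration of the general pattern only, not the code from the bitbake or OE-Core tree; `link_or_copy` and `demo` are hypothetical names.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Hard-link src to dst, falling back to a copy if linking fails.

    Returns "link" or "copy" so the caller can tell which path was taken.
    Linking fails with OSError when, for example, src and dst live on
    different filesystems.
    """
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        shutil.copy2(src, dst)
        return "copy"


def demo():
    """Link a scratch file and report whether both names share an inode."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "LC_NUMERIC")
        dst = os.path.join(tmp, "LC_NUMERIC.linked")
        with open(src, "w") as f:
            f.write("locale data\n")
        how = link_or_copy(src, dst)
        same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
        return how, same_inode


if __name__ == "__main__":
    print(demo())
```

A silent fallback like this is convenient, but it also means a failed link shows up later only as a file with its own inode, which is the kind of symptom jwessel reports above.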
RPjwessel: Its and libc-package.bbclass18:55
jwesselI'll have to figure out what to instrument, but I'll start with old-fashioned code inspection first.18:55
jwesselI'd like to see what lines with what prints out in the logs.18:55
jwesselhmm... I seem to have stumbled on what happened, but it will take a while to figure out what to instrument next.19:05
*** tgraydon <tgraydon!tgraydon@nat/intel/x-skpphsfafofltkvt> has joined #yocto19:05
jwesselThe evidence shows the original file was created and the reference was purged.19:05
jwesselso it became a copy.19:05
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has joined #yocto19:08
RPjwessel: I wonder if some kind of atomic op guarantee on a libc call was broken by pseudo?19:12
jwesselGood question.  I am inspecting through the pseudo code first, then I need to try and find the caller.19:12
fraycalls used to be blocking until recently.. but that isn't the issue cause this stuff was broken before that change..19:13
jwessel   case OP_MAY_UNLINK:19:16
jwessel        if (pdb_may_unlink_file(msg, msg->client)) {19:16
jwessel            /* harmless, but client wants to know so it knows19:16
jwessel             * whether to follow up... */19:16
jwessel            msg->result = RESULT_FAIL;19:16
jwessel        }19:16
jwesselI am not so sure we do the right thing...  Or a badly written app might not do the right thing.19:16
jwesselMore instrumentation is required to find out exactly where we go off the rails.19:17
*** elGamal <elGamal!~elg@> has quit IRC19:30
*** learningc <learningc!~learningc@> has joined #yocto19:45
*** User_ <User_!~learningc@> has quit IRC19:47
*** vpaladino778 <vpaladino778!> has joined #yocto19:51
vpaladino778Hey folks. I was given a Poky SDK Installer. I ran it and tried running the 'environment-setup-armv5e-poky-linux-gnueabi' file that it created, but nothing seems to be happening19:51
Crofton|workthat just sets some environment variables19:53
vpaladino778Do you know how i would compile a program using the Poky SDK that i installed?19:54
frayonce you source it.. call 'make' or autoconf or....19:57
frayif you just want to compile a single file, $CC -o output input.c19:57
frayno magic involved.. standard environment variables are set19:57
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has quit IRC19:59
vpaladino778So after running 'environment-setup' i can just compile as i normally would and it will compile to the targets platform?20:03
fraythat is the idea.. just be sure you use the environment variables and not direct calls to 'gcc'20:04
vpaladino778Ahah. I understand. Thank you for your help.20:07
vpaladino778One last question. How can i make sure that i 'use the environment variables'? Do i just have to make sure i run the 'environment-setup' file in the current shell session?20:08
vpaladino778Sorry for my naivety. I'm a new-grad and this is all pretty new to me20:09
JPEWvpaladino778: Make sure your build system (make, autotools, meson, cmake, whatever) uses them20:09
JPEWIncluding using them yourself if you are your own build system, e.g.: `$CC -o hello hello.c`20:10
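[Editor's sketch] As a rough illustration of "make sure your build system uses them": a home-grown build script would read `CC`, `CFLAGS` and `LDFLAGS` from the environment rather than hard-coding `gcc`. The helper name `compile_command` and the values in `fake_env` are made up for this example; a real SDK environment script sets its own values. Note that `CC` typically holds several words (the cross compiler plus flags), so it is split shell-style.

```python
import shlex


def compile_command(env, source, output):
    """Build a compiler argv from CC/CFLAGS/LDFLAGS in the given mapping."""
    # CC may hold several words (compiler plus flags), so split it shell-style.
    cc = shlex.split(env.get("CC", "cc"))
    cflags = shlex.split(env.get("CFLAGS", ""))
    ldflags = shlex.split(env.get("LDFLAGS", ""))
    return cc + cflags + ["-o", output, source] + ldflags


# Made-up values for illustration only.
fake_env = {
    "CC": "arm-poky-linux-gnueabi-gcc --sysroot=/opt/sdk/sysroot",
    "CFLAGS": "-O2",
}
cmd = compile_command(fake_env, "hello.c", "hello")
print(cmd)
```

In real use one would pass `os.environ` instead of `fake_env` and hand the result to `subprocess.run(cmd, check=True)`, from a shell session in which the environment-setup file has already been sourced.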
*** linuxjacques <linuxjacques!~jacques@nslu2-linux/jacques> has quit IRC20:12
*** bluelightning <bluelightning!~paul@pdpc/supporter/professional/bluelightning> has joined #yocto20:14
*** leitao <leitao!~leitao@2620:10d:c092:200::1:acf8> has quit IRC20:14
*** linuxjacques <linuxjacques!~jacques@nslu2-linux/jacques> has joined #yocto20:25
*** otavio <otavio!~otavio@debian/developer/otavio> has joined #yocto20:32
*** vpaladino778 <vpaladino778!> has quit IRC20:37
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has joined #yocto20:43
*** behanw <behanw!uid110099@gateway/web/> has quit IRC20:49
RPJPEW: Going to try that patch, thanks!20:49
*** elGamal <elGamal!~elg@> has joined #yocto20:53
*** nabokov <nabokov!~armand@> has quit IRC20:55
*** asabil_ <asabil_!~asabil@2a01:79d:7375:2ca4:195:1cc1:2b31:ba74> has joined #yocto21:00
*** asabil <asabil!~asabil@2a01:79d:7375:2ca4:fd3d:19c0:f6c1:5aea> has quit IRC21:03
*** behanw <behanw!uid110099@gateway/web/> has joined #yocto21:04
JaMajwessel: have you seen this? it's definitely a badly written app and breaks pseudo every single time (maybe with different root cause though)21:05
jwesselI had not seen that one.21:05
JaMamore details in
yoctiBug 12434: normal, Medium+, 2.8 M3, randy.macleod, ACCEPTED , pseudo: Incorrect UID/GID in packaged files21:06
RPJPEW: To continue that conversation from earlier, I just dreamt the idea of task specific hosttools using symlinks21:10
RPIt's so crazy we have to do it...21:10
jwesselJaMa: It doesn't seem at first glance that is the same problem.21:10
RPJPEW: would solve the task rss contamination problem in theory to a large extent21:11
*** berton <berton!~berton@> has quit IRC21:17
RPjwessel: we've had far too many rays of hope with this bug where we think we may have found it, then not....21:24
jwesselI don't doubt that.  This is fairly complex.21:24
jwesselI see something happening consistently with each build, even in the ones that don't actually fail.21:25
RPjwessel: I've suspected/hoped for that. Question is why some fail and some don't21:26
RPjwessel: if we could make it 100% failing we'd no doubt quickly figure it out :)21:26
jwesselThat answer is pretty easy, from the logs.21:26
jwesselIf I add some sleep into pseudo, it will probably fail every time, but I am not sure.21:26
jwesselThe logs indicate it is picking up the underlying owner (whatever was used last) for the inode in the case of the hard link that was turned into a file.21:27
jwesselIt happens a few times in every single build, but the ones that pass are the ones where it has selected UID 0.21:27
jwesselIt really should be 100% impossible for the hard link to have no source reference.21:28
jwesselThat is why I put a print log there.  I figured it shouldn't be hitting that line of code _ever_21:28
JaMainteresting part for me was that it's consistent when restoring "bad" do_package archive from sstate (so once you create and store bad archive it will always fail as long as you're using the same sstate signature)21:29
jwesselI haven't been able to make a shell script which makes it act the same.21:29
RPjwessel: so my inode reuse theory is actually right!21:29
jwesselWell technically it appears to be the pseudo cache.21:29
jwesselThe problem is a bit hard to track because each instance involves 3 inodes.21:30
RPjwessel: Right, that was what I'd theorised though - that some file known to pseudo was deleted and then a new file was created using the same inode so the permissions/user were copied21:30
JaMaand that pseudo writes a lot of warnings and errors in each build, even the ones which probably ended OK :)21:31
RPjwessel: it kind of implies we're missing deletion tracking with pseudo which is possible as we remove files out of pseudo context21:31
jwesselIs there any way that is happening?21:31
RPjwessel: I wish we could do path filtering in pseudo :/21:31
RPjwessel: absolutely certain it's happening21:31
jwesselIt looked to me like the delete was actually there, and it's more like there is something happening in parallel.21:31
RPjwessel: it was only a theory but I know we do delete files out of context21:32
RP(why would we load pseudo and do deletion under it?)21:32
jwesselAs long as you are sane about it, it should be ok.21:32
RPjwessel: what could be interesting would be doing a sanity check of pseudo's db against the disk periodically21:33
jwesselI can say with certainty I know how to re-use the inodes.21:33
jwesselOn a quiet disk I can allocate and re-allocate any which way I want.21:33
RPjwessel: I never figured that out :)21:34
jwesselBut I am trying to limit it down to something reproducible.21:34
RPjwessel: right, that is what we need21:34
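[Editor's sketch] The inode reuse theory discussed above can be modelled with a toy cache. This is only an illustration of the suspected mechanism, not pseudo's real database or schema: `OwnershipCache`, the inode number 4242 and `BUILD_UID` are all hypothetical.

```python
class OwnershipCache:
    """Toy stand-in for an inode-keyed ownership database."""

    def __init__(self):
        self._owners = {}

    def record(self, inode, uid):
        self._owners[inode] = uid

    def forget(self, inode):
        self._owners.pop(inode, None)

    def lookup(self, inode, real_uid):
        # Fall back to the real on-disk owner when the inode is unknown.
        return self._owners.get(inode, real_uid)


BUILD_UID = 1000  # hypothetical build user

cache = OwnershipCache()
cache.record(inode=4242, uid=0)  # file created under the cache as "root"

# The file is deleted outside the tracked context, so forget() is never
# called. The filesystem then hands inode 4242 to a new, untracked file.
stale = cache.lookup(4242, real_uid=BUILD_UID)
print(stale)  # 0: the new file inherits the deleted file's recorded owner

# With deletion tracking the stale entry is gone and the real owner shows.
cache.forget(4242)
fresh = cache.lookup(4242, real_uid=BUILD_UID)
print(fresh)  # 1000
```

In this model, which owner the new file appears to have depends entirely on what the recycled inode was last recorded with, which would be consistent with some builds passing and others failing.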
jwesselI can't see at the moment how to get it to the state where the ln proceeds without the whole "load my system and hope it happens..."21:34
jwesselAt this point I do see the bad state with every single run of the do_package.21:35
* jwessel has a few more test cases to try to duplicate it before the day is over.21:36
* RP is hoping jwessel can figure it out :)21:41
jwesselThis might take days.21:41
*** asabil_ <asabil_!~asabil@2a01:79d:7375:2ca4:195:1cc1:2b31:ba74> has quit IRC21:42
jwesselAs with all problems like this, finding the root cause doesn't mean it will be easy to fix.21:42
RPJPEW: new server swaps build warnings for failures:
RPjwessel: I know. I'm dealing with a similar multiheaded monster in the form of runqueue too :/21:44
RPJPEW: {"connections": {"total_time": 130.6708268285729, "max_time": 0.027425227221101522, "num": 485853, "average": 0.0002689513635370635, "stdev": 0.0002935243837515032}, "requests": {"total_time": 824.3801162638701, "max_time": 0.43299578316509724, "num": 485848, "average": 0.0016967860653205739, "stdev": 0.0015868506651319188}}21:46
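[Editor's sketch] The stats line above can be sanity-checked by recomputing each average as total_time / num. The log does not state units; treating the times as seconds is an assumption made here for the millisecond printout.

```python
import json
import math

# Stats line copied from the log above.
raw = (
    '{"connections": {"total_time": 130.6708268285729, '
    '"max_time": 0.027425227221101522, "num": 485853, '
    '"average": 0.0002689513635370635, "stdev": 0.0002935243837515032}, '
    '"requests": {"total_time": 824.3801162638701, '
    '"max_time": 0.43299578316509724, "num": 485848, '
    '"average": 0.0016967860653205739, "stdev": 0.0015868506651319188}}'
)
stats = json.loads(raw)

checks = {}
for name, s in stats.items():
    derived = s["total_time"] / s["num"]
    # The reported average should match total_time / num up to rounding.
    checks[name] = math.isclose(derived, s["average"], rel_tol=1e-6)
    # Assumes times are in seconds, hence the conversion to milliseconds.
    print(f"{name}: {derived * 1000:.3f} ms average, consistent={checks[name]}")
```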
JPEWRP: hmm... did you remove the io and sql  stats?21:47
RPJPEW: no?21:47
RPJPEW: oh, I think I've missed a commit21:48
JPEWAh.... I'm actually impressed that applies cleanly :)21:48
RPJPEW: yes, so am I looking at it!21:49
JPEWRP: Git is a mysterious entity21:49
RPJPEW: would that cause the connection reset issue?21:49
JPEWI have been running bitbake-selftest hashserv.tests on my changes21:50
JPEWIf the server was getting a python exception in the thread I think that would cause a connection reset21:51
RPJPEW: that was what I was just thinking looking at the missing import21:52
JaMaRP: time bitbake world -P -n is still running after over 5 hours not yet executing tasks (on older bitbake) so it might be even slower than 253min run done earlier today on the same build21:53
*** georgem_home <georgem_home!uid210681@gateway/web/> has joined #yocto21:56
RPJaMa: That is just crazy :(21:57
RPJaMa: I need to figure out what is wrong. I think my more limited world build should give me the baseline profile data I need...21:57
RPJPEW: I've sorted out the patches and restarted everything, let's try again21:58
* RP knows he's going to run out of time on this soon :(21:58
RPJPEW: :(22:06
JPEWRP: Argh.... OK.22:08
RPI'll try and keep the logging but revert back to the previous threading22:09
JPEWRP: OK. I have to go home, but I'll give it a think.22:09
RPJPEW: thanks, this was a good try :)22:09
*** agust <agust!> has quit IRC22:10
RPJPEW: I'm just reverting so I can test the other changes in -next which are a mix of normal patches and runqueue fixes22:10
jwesselSo, in a non-failed build I absolutely have the same problem each time, as I mentioned before.22:20
jwesselI see how it happens, but still don't know the root cause.22:21
jwesselThere is a pile of these requests coming in, in parallel.   And the mv bit is not atomic.22:22
jwesselThe unlink occurs before the rename22:23
jwesselI haven't determined that the problem is actually the fault of pseudo or not.22:23
jwesselI don't understand how all these db requests are coming in, in parallel.22:23
jwesselBTW, if we were to add a QA check against the files we know are supposed to all be hard linked together, it would fail 100% of the time.22:25
jwesselWe get a couple strays that became hard copies each time.22:25
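[Editor's sketch] A QA check of the kind jwessel describes could compare device and inode numbers across a group of files that are expected to be hard linked together. This is a minimal sketch under that assumption; `stray_copies` is a hypothetical helper, not an existing QA test, and the file names are made up.

```python
import os
import tempfile


def stray_copies(paths):
    """Return the paths that do not share the first path's (device, inode)."""
    first = os.lstat(paths[0])
    key = (first.st_dev, first.st_ino)
    strays = []
    for p in paths[1:]:
        st = os.lstat(p)
        if (st.st_dev, st.st_ino) != key:
            strays.append(p)
    return strays


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a")
    b = os.path.join(tmp, "b")
    c = os.path.join(tmp, "c")
    with open(a, "w") as f:
        f.write("locale data")
    os.link(a, b)  # a genuine hard link
    with open(c, "w") as f:
        f.write("locale data")  # same contents, but a separate file
    strays = stray_copies([a, b, c])
    print([os.path.basename(p) for p in strays])
```

Comparing contents alone would not catch the problem, since a stray copy is byte-identical to its source; only the inode comparison reveals it.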
*** BCMM <BCMM!~BCMM@unaffiliated/bcmm> has quit IRC22:26
RPjwessel: That sounds like a good way to debug this if we have a sentinel we can spot....22:26
jwesselBy luck of the draw most of the time it has not failed due to the QA check being for the builder UID.22:26
RPjwessel: most files we delete would be owned by "root", not the build user22:26
jwesselIt is going to be a different file every time, but the investigation thus far shows 100% failure rate.22:26
RPright, statistically we process enough files that something breaks22:27
jwesselI really need to have a hard look at the operations.22:27
jwesselBut that is it for now.22:27
jwesselFor today anyway.22:27
RPjwessel: it's great progress as I can think of ways of debugging this further from there! :)22:28
*** neverpanic <neverpanic!> has quit IRC22:30
*** neverpanic <neverpanic!> has joined #yocto22:32
*** tijko <tijko!~tijko@unaffiliated/tijko> has quit IRC22:40
RPkanavin, JaMa: the notifications are about the generation of task specific profiles22:42
RPJPEW: managed another connection reset without the threading patch :(22:46
RPdefinitely not as common22:46
JaMaRP: I'm getting close with that profile, now on task 50K of 62K22:54
JaManow when some tasks are "running" there is some load on the worker process as well22:56
RPJaMa: that makes sense since I guess it's instructed to fork()22:57
*** vineela <vineela!vtummala@nat/intel/x-hjjhjvtldyngfbkt> has quit IRC23:10
JaMaheh, it finished.. *a lot* of profile* files in this directory :), will upload in a sec23:10
JaMathe first is the "short 5 min" build til "Executing Tasks", then the profile.log.processed and last is from profile-worker.log.processed23:17
JaMaall 3 with bitbake 1f630fdf0260db08541d3ca9f25f852931c19905 (before most runqueue changes)23:17
JaMamaybe the logger.debug in scenequeue_updatecounters()?23:35
*** goliath <goliath!> has quit IRC23:38
*** florian <florian!~florian_k@Maemo/community/contributor/florian> has quit IRC23:56
