This page tries to give an overview of all the facts that are used in the SKB.

= SKB (System Knowledge Base) =
 1. Adrian gave a presentation at the Barcelona Workshop (2010) which is helpful as an introduction: [[http://wiki.barrelfish.org/BarcelonaWorkshop2010?action=AttachFile&do=view&target=skb.pdf]]
 1. Check [[http://www.barrelfish.org/publications/schuepbach-declarative-os.pdf|Adrian's PhD thesis]] for a good introduction to what the SKB does and how it does it.

== Checklist ==
 a. What does the name of the fact stand for?
 a. What information is contained in this fact (what is in the first, second, third argument)?
 a. Where does this information come from (e.g., ACPI tables, generated from other facts, etc.) and who adds it to the SKB?
 a. Feel free to add anything I missed.
 a. Bonus (once the above questions are answered for all the facts): how is this information used in BF (or is it not used at all?), and what could it be used for in the future?

== Fact: addr(arg1, arg2, arg3) ==
Description:
 1. '''arg1''':
 1. '''arg2''':
 1. '''arg3''':

== Static ==
These facts provide a basic representation of the hardware and are initialized upon boot.

 * '''apic(ApicProcessorID, ApicID, Usable)'''
  * Description: Registration of each CPU and its APIC ID.
   1. ApicProcessorID [int] - The processor's Processor ID.
   1. ApicID [int] - The processor's local APIC ID.
   1. Usable [int] - "If zero, this processor is unusable, and the operating system support will not attempt to use it."
  * These details originate from the ACPI APIC table (MADT: Multiple APIC Description Table). Section 5.2.12.2, page 137 of the [[http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf|ACPI Specification]] contains additional details.
  * If you are interested in the mapping between the Barrelfish CoreID and these entries, use '''corename'''.
 * '''ioapic(Id, Address, !GlobalIrqBase)'''
  * Description: Registration of each I/O APIC device with its {{{Id}}} and the {{{Address}}} where it can be accessed.
   1. Id [uint8] - I/O APIC ID
   1. Address [uint32] - Physical address where this I/O APIC can be accessed
   1. !GlobalIrqBase [uint32] - Global system interrupt number where this I/O APIC's INTI (interrupt input) lines start.
  * Defined in struct ACPI_MADT_IO_APIC (usr/acpi/acpica/include.h).
  * Added to the SKB in usr/acpi/interrupts.c.
  * More details can be found in Section 5.2.12.3 (I/O APIC Structure) of the [[http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf|ACPI Specs]].
 * '''apic_nmi(ProcessorID, !IntiFlags, Lint)'''
  * Description: Registers the LINT (local APIC interrupt input) to which the NMI (non-maskable interrupt) is connected for the given ProcessorID.
   1. ProcessorID [uint8] - ACPI processor ID
   1. !IntiFlags [uint16] - MPS INTI flags. Encodes trigger mode/polarity.
   1. Lint [uint8] - LINTn to which the NMI is connected
  * Defined in struct ACPI_MADT_LOCAL_APIC_NMI (usr/acpi/acpica/include.h).
  * More details can be found in Sections 5.2.12.6 and 5.2.12.7 (MADT) of the [[http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf|ACPI Specs]].
 * '''interrupt_override(Bus, SourceIRQ, GlobalIRQ, IntiFlags)'''
  * Description: Relation between legacy ISA interrupts and GSIs (Global System Interrupts).
   1. Bus [int] - Bus number. Always 0 according to the specs.
   1. SourceIRQ [int] - ISA interrupt number
   1. GlobalIRQ [int] - GSI number
   1. !IntiFlags - MPS INTI flags. Encodes trigger mode/polarity.
  * Added by the ACPI service; it originates from the ACPI MADT table.
  * See Section 5.2.12.5 (MADT) of the [[http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf|ACPI Specs]].
 * '''cpu_affinity(ApicID, !LocalSapicEid, !ProximityDomain)'''
  * Description: Mapping from APIC or SAPIC ID to NUMA domain.
   1. ApicID [int] - join with the {{{apic}}} fact.
   1. !LocalSapicEid [int] - not generally used; most use cases will have an APIC ID.
   1. 
!ProximityDomain [uint32] - NUMA domain number (starting from 0), join with {{{memory_affinity}}}.
  * This information is added by the ACPI service and comes from the SRAT (System Resource Affinity Table). For more details see Section 5.2.16, page 151 of the [[http://www.acpi.info/DOWNLOADS/ACPIspec50.pdf|ACPI Specification]].
  * Warning: these facts may not be available if the corresponding ACPI table is not present.
 * '''node_distance(!ProximityDomainFrom, !ProximityDomainTo, !Distance)'''
  * Description: Distance information between two NUMA nodes.
   1. !ProximityDomainFrom [int] - NUMA ID of the sender.
   1. !ProximityDomainTo [int] - NUMA ID of the receiver.
   1. !Distance [uint8] - distance information.
  * This information is added by the ACPI service and comes from the SLIT (System Locality Distance Information Table); see the ACPI specification.
  * Warning: these facts may not be added if the table is not present.
 * '''memory_affinity(!BaseAddress, Length, !ProximityDomain)'''
  * Description: Association between physical memory and the NUMA domain it belongs to.
   1. !BaseAddress [uint64] - together with Length, denotes a range of physical memory [!BaseAddress, !BaseAddress+Length)
   1. Length [uint64]
   1. !ProximityDomain [uint32] - NUMA domain number
  * Use case: implementation of a NUMA-aware memory allocator.
  * As with {{{cpu_affinity}}}, this is also taken from the ACPI SRAT table.
  * There are also some extra flags that are not exported (bool hotpluggable, bool nonvolatile).
 * '''mem_region_type(Type, Name)'''
  * Description: Set of predefined values for the {{{Type}}} field of {{{memory_region}}}.
   1. Type [int] - integer values corresponding to 'enum region_type' in .
   1. Name [atom] - convenience name for use from Prolog.
  * Example content:
{{{
mem_region_type(0, ram). # RegionType_Empty
mem_region_type(1, roottask). # RegionType_RootTask
mem_region_type(2, phyaddr). # RegionType_PhyAddr
mem_region_type(4, multiboot_module). # RegionType_Module
mem_region_type(3, platform_data). # RegionType_PlatformData
mem_region_type(5, apic). 
# RegionType_LocalAPIC
mem_region_type(6, ioapic). # RegionType_IOAPIC
}}}

=== Intel VT-d ===
These facts are discovered by ACPI upon boot and describe the I/OMMUs found in the system. The ACPI tables are described in [[http://www.intel.com/content/www/us/en/embedded/technology/virtualization/vt-directed-io-spec.html|vt-directed-io.pdf]].

 * '''dmar(Flags)'''
  * Description: Indicates the presence of a DMAR table in ACPI.
   1. Flags - A bitfield.
    * Bit 0 - intr_remap - If set, the I/OMMU supports interrupt remapping.
    * Bit 1 - x2apic_opt_out - If set, the OS should not go into x2apic mode.
 * '''dmar_hardware_unit(Index, Flags, Segment, Address)'''
  * Description: Indicates the presence of a DMA remapping hardware unit for the given PCI segment.
   1. Index [int] - index matching the index in dmar_device
   1. Flags - A bitfield.
    * Bit 0 - Include PCI all?
   1. Segment [uint16] - The PCI segment this remapping unit is for
   1. Address - The Register Base Address where this unit can be accessed.
 * '''dmar_device(Index, !AcpiDmarType, !EntryType, Addr(_,_,_,_), !EnumerationId)'''
  * Description: The result of parsing the device scope ACPI tables. It describes the mapping of devices to, for example, DMAR devices.
   1. Index - Index for matching it with other dmar_ facts
   1. !AcpiDmarType [uint8] - Remapping Structure Type
    * 0 - DRHD - DMA Remapping Hardware Unit Definition
    * 1 - RMRR - Reserved Memory Region Reporting Structure
    * 2 - ATSR - Root Port ATS Capability Reporting
    * 3 - RHSA - Remapping Hardware Static Affinity Structure
    * 4 - ANDD - ACPI Name-space Device Declaration Structure
   1. !EntryType [uint8]
    * 1 - PCI Endpoint device, Addr describes the PCI address
    * 2 - PCI Sub-hierarchy, Addr describes the PCI sub-hierarchy
    * 3 - IOAPIC, !EnumerationId contains the IOAPIC ID
    * 4 - MSI-capable HPET, !EnumerationId contains the HPET number
    * 5 - ACPI Namespace device, Addr contains the ACPI name (? Not sure how this is encoded).
   1. Addr is a 4-tuple of Segment, Bus, Device, Function, describing the PCI address.
   1. 
!EnumerationId [uint8] - see !EntryType.
 * '''dmar_reserved_memory(Index, Segment, !BaseAddress, !LimitAddress)'''
  * Description: Indicates reserved BIOS memory regions that may be targets for DMA. Used for legacy devices (USB, ...). The devices that need access to this area are given using the dmar_device predicate. If not present, there are no such devices.
   1. Index - Index for matching it with other dmar_ facts
   1. Segment [uint16] - PCI segment
   1. !BaseAddress [uint64] - 64-bit memory base address
   1. !LimitAddress [uint64] - 64-bit memory limit
 * '''dmar_atsr(Index, Flags, Segment)'''
  * Description: An ATSR structure is provided for each PCI segment supporting Device-TLBs. If not present, there are no such devices.
   1. Index - Index for matching it with other dmar_ facts
   1. Flags - A bitfield.
    * Bit 0 - ALL_PORTS. When set, all devices support ATS transactions; otherwise, only those described through dmar_device.
   1. Segment [uint16] - PCI segment.
 * '''dmar_rhsa(Index, !BaseAddress, !ProximityDomain)'''
  * Description: Optional. Remapping Hardware Static Affinity structure.
   1. Index - Index for matching it with other dmar_ facts
   1. !BaseAddress [uint64] - The base address
   1. !ProximityDomain [uint32] - Proximity domain
 * '''dmar_andd(Index, !DeviceNumber, !ObjectName)'''
  * Description: ACPI Name-space Device Declaration structure (ANDD). This provides a lookup for dmar_device entries that have !EntryType = ACPI Namespace.
   1. Index - Index for matching it with other dmar_ facts
   1. !DeviceNumber [uint8] - The device number. Unique.
   1. !ObjectName [string] - The ACPI path

== Dynamic ==
These facts describe the current state of the system and are updated periodically at runtime.

 * '''device'''
   1. Addr/3
   1. Vendor
   1. !DeviceId
   1. Class
   1. Subclass
   1. Programming Interface
   1. !InterruptPin: Four of the physical pins on the PCI card carry interrupts from the card to the PCI bus. The standard labels these as A, B, C and D. The Interrupt Pin field describes which of these pins this PCI device uses. 
Generally it is hardwired for a particular device. That is, every time the system boots, the device uses the same interrupt pin. This information allows the interrupt handling subsystem to manage interrupts from this device. The value is read from the PCI HW registers.
 * '''memory_region(!BaseAddress, Bits, Size, Type, Data)'''
  * Description: Representation of 'struct mem_region' in .
   1. !BaseAddress [GENPADDR] - together with 'Size', defines a region of memory [!BaseAddress, !BaseAddress+Size). TODO: clarify if these are physical or virtual addresses.
   1. Bits [uint] - If Bits == 0, then Size == 1 << Bits. This may not always be the case for non power-of-two memory regions.
   1. Size [size_t]
   1. Type [uint] - enum values defined in {{{mem_region_type}}}.
   1. Data [ptrdiff_t?]
 * '''binding(Id, !EventBinding, !RpcBinding)'''
  * Octopus uses two bindings to communicate with one client. This fact stores the pointers to the RPC and Event binding structs and maps from one to the other. The Event or RPC binding can also be 0 if the client has registered only one binding (so far).
   1. Id [uint] - Unique number for every stored binding fact
   1. !EventBinding [uint64] - Value of the pointer to the in-memory C binding structure created by Flounder
   1. !RpcBinding [uint64] - Value of the pointer to the in-memory C binding structure created by Flounder
 * '''corename(CoreID, Architecture, apic(ApicID))'''
   1. CoreID [uint8] - assigned sequentially starting from 0 as each core is booted, but may have holes if a core is not present.
   1. Architecture [atom] - currently hardcoded as 'x86_64'; there is an open TODO comment.
   1. ApicID [uint64] - see fact {{{apic}}}.
  * Notes:
   * Stored by 'usr/kaluga/start_cpu.c' and 'usr/spawnd/bsp_bootup.c', but it is unclear which is actually active. The ApicID is retrieved from Octopus or with another SKB query ({{{get_apic_id_list}}}).
   * Used by 'usr/acpi/interrupts.c' ({{{enable_and_route_interrupt}}}) and 'usr/skb/programs/queries.pl' (1. 
{{{get_core_id_list}}} exports this as a sorted list, 2. in the implementation of the {{{ram_closest_to}}} query).
 * '''assignedGsi(Addr, Pin, Gsi)'''
  * Assigned Global System Interrupt (see 5.2.11 in [1]). This fact is added during the PCI bus enumeration. The PCI service calls the function assigndeviceirq(Addr) for each device it finds, which in turn stores the assignedGsi fact to cache the GSI assignments. Global System Interrupts can be thought of as ACPI Plug and Play IRQ numbers. They are used to virtualize interrupts in tables and in ASL methods that perform resource allocation of interrupts (i.e., they give a unique number to each interrupt pin in a system).
   1. Addr [addr/3] - Address of the device (addr in the device fact).
   1. Pin [Integer] - Pin of the device (see Interrupt Pin in the device fact)
   1. Gsi [Integer] - GSI assigned to this device and pin
 * '''usedGsi(Int, fixedGsi)'''
  * Fact to cache already assigned GSIs.
 * '''pir(Source, Gsi)'''
  * For each link device, pir facts are added describing the possible GSIs that may be selected for a given device.
   1. Source [String] - Name of the link device
   1. Gsi [int] - Global System Interrupt number
  * Example: pir("\\_SB_.PCI0.LSMB", 20).
 * '''prt(Addr, Pin, pir(Int) | gsi(Int))'''
  * At start-up, the PCI and ACPI drivers add a fact for every PCI interrupt routing table entry, mapping a device address and interrupt pin to a source.
   1. Addr/3 - Address of the PCI device
   1. Pin [int 0-3] - Pin of the PCI device
   1. pir(int) | gsi(int) - Interrupt link device or GSI that this interrupt is allocated to.
 * '''setPir(String, Gsi)'''
  * Used to store already allocated GSI <-> link device mappings.
   1. Name of the link device
   1. Global System Interrupt
 * '''addr(Bus,Device,Function) [pci,acpi]'''
  * Describes PCI addresses. Only used in other facts.
   1. {{{Bus}}} bus on which the device lives
   1. {{{Device}}} PCI device id (per bus) on which the device lives
   1. 
{{{Function}}} PCI device function (per device id) on which the device lives
 * '''ahci_device(Id) [ahcid]'''
  * Provides an enumeration of all available SATA ports in the system. Not used anywhere.
   1. {{{Id}}} SATA port id
 * '''bar(addr(Bus,Device,Function), Id, !BaseAddr, Size, Space, !IsPrefetchable, Type) [pci]'''
  * Description: Base Address Register for a PCI device
   1. {{{addr}}}: Device address, see {{{addr/3}}}
   1. {{{Id}}}: BAR id for that device
   1. {{{BaseAddr}}}: Base address of the region indicated by the BAR
   1. {{{Size}}}: Size of the region
   1. {{{Space}}}: {{{["mem", "io"]}}}, indicates whether the BAR refers to IO space or memory.
   1. {{{IsPrefetchable}}}: {{{["prefetchable", "nonprefetchable"]}}}, indicates whether SW can prefetch data in the region
   1. {{{Type}}}: {{{[32, 64]}}}, indicates whether the BAR is 32 or 64 bits wide
  * See query_bars() in usr/pci/pci.c for details on how these facts are acquired.
 * '''bridge(Type, addr(Bus, Device, Function), Vendor, !DeviceId, Class, !SubClass, !ProgIf, secondary(!BusNum)) [pci]'''
  * Describes a PCI (express) bridge device
   1. {{{Type}}}: {{{["pcie", "pci"]}}}, distinguishes between PCI and PCI express bridges
   1. {{{addr}}}: Device address, see {{{addr/3}}}.
   1. {{{Vendor}}}: Device vendor
   1. {{{DeviceId}}}: Device ID
   1. {{{Class}}}: Device class
   1. {{{SubClass}}}: Device subclass
   1. {{{ProgIf}}}: Device programming interface
   1. {{{secondary(BusNum)}}}: secondary bus number (XXX: get info on this)
 * '''device(Type, addr(Bus,Device,Function), Vendor, !DeviceId, Class, !SubClass, !ProgIf, !IrqPin) [pci]'''
  * Describes a PCI (express) device
   1. {{{Type}}}: {{{["pcie","pci"]}}}, indicates whether the device is PCI or PCI express.
   1. {{{addr}}}: Device address, see {{{addr/3}}}.
   1. {{{Vendor}}}: Device vendor
   1. {{{DeviceId}}}: Device ID
   1. {{{Class}}}: Device class
   1. {{{SubClass}}}: Device subclass
   1. {{{ProgIf}}}: Device programming interface
   1. 
{{{IrqPin}}}: interrupt pin for the device
 * '''currentbar(addr(Bus,Device,Function), BAR, Base, High, Size) [pci, doesn't occur in any skb_add_fact() call]'''
  * Used in Prolog queries; returns the current BAR for a device
   1. {{{addr}}}: Device address, see {{{addr/3}}}.
   1. {{{BAR}}}: current base address register of the device at addr.
   1. {{{Base}}}: current base address
   1. {{{High}}}: end of the region indicated by the BAR
   1. {{{Size}}}: size of the region
 * '''rootbridge(addr(Bus,Device,Function), childbus(Min,Max), mem(Min,Max)) [acpi]'''
  * Describes a PCI (express) root bridge in the system
   1. {{{addr}}}: Device address, see {{{addr/3}}}.
   1. {{{childbus(Min,Max)}}}: bus numbers of the children of this bus are between {{{Min}}} and {{{Max}}}.
   1. {{{mem(Min,Max)}}}: memory region covered by this root bridge is {{{[Min, Max]}}}
 * '''rootbridge_address_window(addr(Bus,Device,Function), mem(Min,Max)) [acpi]'''
  * Describes the memory region covered by a PCI (express) root bridge.
   1. {{{addr}}}: Device address, see {{{addr/3}}}.
   1. {{{mem(Min,Max)}}}: memory region belonging to the device at addr is {{{[Min, Max]}}}
 * '''childbus(!MinBus,!MaxBus) [acpi]'''
  * Bus numbers of the children of the current bus are between the input values.
   1. {{{MinBus}}}: The minimum bus number that is assigned
   1. {{{MaxBus}}}: The maximum bus number that is assigned
  * Used in the SKB for root and rootbridge related functions.
 * '''secondary(Busnum) [pci]'''
  * Refers to the secondary bus.
   1. {{{Busnum}}}: bus number of the secondary bus
  * Used in the SKB for bridge / device related programs.

== Other Notes ==
 * This list is taken from Adrian's PhD thesis (section 3.5.2) and may be useful as a starting point:
{{{
apic_nmi(ACPI_ProcessorID, IntiFlags, Lint).
bar(addr(Bus, Dev, Fun), BARNr, Base, Size, mem|io, (non)prefetchable, Bits (64|32)).
bridge(pcie|pci, addr(Bus, Dev, Fun), VendorID, DeviceID, Class, SubClass, ProgIf, secondary(Sec)).
currentbar(addr(Bus, Dev, Fun), BARNr, Base, Limit, Size). 
device(pcie|pci, addr(Bus, Dev, Fun), VendorID, DeviceID, Class, SubClass, ProgIf, IntPin).
fixed_memory(Base, Limit).
interrupt_override(Bus, SourceIRQ, GlobalIRQ, IntiFlags).
ioapic(APICID, Base, Global_IRQ_Base).
message_rtt(StartCore, DestCore, Avg, Var, Min, Max).
nr_running_cores(Nr).
rootbridge(addr(Bus, Dev, Fun), childbus(MinBus, MaxBus), mem(Base, Limit)).
  -> Can join first parameter of 'mem' with memory_region fact.
rootbridge_address_window(addr(Bus, Dev, Fun), mem(Min, Max)).
}}}

== CPU Information ==
Updated CPU information, based on the SKB datagatherer ({{{/usr/skb/measurement}}}):
{{{
vendor(Core_ID,[Vendor_String]).
cpu_family(Core_ID, Vendor_String, Family, Model, Stepping).
cpu_thread(Core_ID, Package, Core, HyperThread).
cpu_cache(Core_ID, Name, Level, data|instr|unified, Size, Associativity, LineSize, Shared, Inclusive).
cpu_tlb(Core_ID, data|instr|unified, Level, PageSize, Entries, Associativity).
cpu_addrspace(Core_ID, BitsPhys, BitsVirt, BitsGuest).
}}}

 * Proposed changes:
  * Rename some facts to better reflect what they stand for (e.g. {{{cpu_affinity}}} -> {{{apic_to_numa_mapping}}}).
  * It seems APIC IDs have a structure which reflects the topology (NUMA domain, chip, core, hyperthread). This is useful to know and could easily be exposed with some utility functions. See: [[http://wiki.osdev.org/Detecting_CPU_Topology_(80x86)]]
  * CPUID: only available if the datagatherer is activated, with differing code paths and facts for AMD/Intel. It seems somewhat standardized (see [[http://www.sandpile.org/x86/cpuid.htm|sandpile.org]]), and it would be useful to have the cache layout.
  * Is there a desire for lazily-loaded facts? They would allow a rich data model while offsetting the cost of harvesting information (e.g. inter-core RTT microbenchmarks).

= References =
[1] Advanced Configuration and Power Interface Specification, Revision 2.0c, http://www.acpi.info/DOWNLOADS/ACPIspec-2-0c.pdf
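
= Appendix: Joining Affinity Facts (Sketch) =
The descriptions of {{{cpu_affinity}}} and {{{memory_affinity}}} suggest joining the two facts on the proximity domain, for example for a NUMA-aware allocator. The following is only an illustrative Python model of that join, not SKB or Barrelfish code: the fact tuples are invented sample values, and the helper names {{{domain_of_apic}}}, {{{memory_for_apic}}} and {{{domain_of_address}}} are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical sample facts, shaped like the SKB facts described on this page.
# cpu_affinity(ApicID, LocalSapicEid, ProximityDomain)
CPU_AFFINITY: List[Tuple[int, int, int]] = [
    (0, 0, 0),
    (1, 0, 0),
    (2, 0, 1),
    (3, 0, 1),
]

# memory_affinity(BaseAddress, Length, ProximityDomain)
MEMORY_AFFINITY: List[Tuple[int, int, int]] = [
    (0x0000_0000, 0x8000_0000, 0),
    (0x8000_0000, 0x8000_0000, 1),
]


def domain_of_apic(apic_id: int) -> Optional[int]:
    """Look up the NUMA domain of a core via its cpu_affinity fact."""
    for aid, _sapic_eid, domain in CPU_AFFINITY:
        if aid == apic_id:
            return domain
    return None  # no SRAT entry for this APIC ID


def memory_for_apic(apic_id: int) -> List[Tuple[int, int]]:
    """Join cpu_affinity with memory_affinity on the proximity domain.

    Returns the half-open [base, limit) ranges local to the core.
    """
    domain = domain_of_apic(apic_id)
    if domain is None:
        return []
    return [
        (base, base + length)
        for base, length, mem_domain in MEMORY_AFFINITY
        if mem_domain == domain
    ]


def domain_of_address(addr: int) -> Optional[int]:
    """Find the NUMA domain that owns a physical address, if any."""
    for base, length, domain in MEMORY_AFFINITY:
        if base <= addr < base + length:
            return domain
    return None
```

Because both facts may be missing when the SRAT is absent, the helpers return {{{None}}} or an empty list instead of raising an error; a caller would then fall back to treating all memory as equally distant.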