


How is sizeof implemented in C++?

  

Suggestion from user lan-se-52-30:
      

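A sizeof expression is replaced by the compiler itself, so even in the generated assembly you will only see a constant. That is why the suggestion in another answer to look at the disassembly does not work: the replacement has already happened inside the compiler. (Strictly speaking, VLAs are a special case, as a comment in the code below points out.) Below, we follow Clang's handling of sizeof to see how it is implemented.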
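To make the "replaced by a constant" point concrete, here is a small sketch of my own (not taken from Clang). `Packet`, `Buffer` and `operandIsNotEvaluated` are made-up names, and the last function assumes C++14 or later. It shows sizeof being accepted where only a compile-time constant is allowed, and that its operand is never evaluated:

```cpp
#include <cstddef>

// Hypothetical example type; any complete type would do.
struct Packet {
    int id;
    char tag[12];
};

// sizeof yields a constant expression, so it is accepted in static_assert.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(sizeof(Packet) >= sizeof(int) + sizeof(char[12]),
              "a struct is at least as large as its members combined");

// It can also be a template argument and, through it, an array bound.
template <std::size_t N>
struct Buffer {
    unsigned char bytes[N];
};
using PacketBuffer = Buffer<sizeof(Packet)>;
static_assert(sizeof(PacketBuffer::bytes) == sizeof(Packet),
              "the array bound was fixed at compile time");

// The operand of sizeof is an unevaluated operand: only its type is
// inspected, so the increment below never runs.
constexpr int operandIsNotEvaluated() {
    int counter = 0;
    std::size_t size = sizeof(counter++);
    (void)size;
    return counter;
}
static_assert(operandIsNotEvaluated() == 0, "counter was not incremented");
```

If any of these assertions were wrong, the file would fail to compile, which is itself evidence that the values exist before any code is generated.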

In Clang's implementation, lib/AST/ExprConstant.cpp contains this method:

    bool IntExprEvaluator::VisitUnaryExprOrTypeTraitExpr

Its body looks like this:

      switch(E->getKind()) {
      case UETT_AlignOf: {
        if (E->isArgumentType())
          return Success(GetAlignOfType(Info, E->getArgumentType()), E);
        else
          return Success(GetAlignOfExpr(Info, E->getArgumentExpr()), E);
      }

      case UETT_VecStep: {
        QualType Ty = E->getTypeOfArgument();

        if (Ty->isVectorType()) {
          unsigned n = Ty->castAs<VectorType>()->getNumElements();

          // The vec_step built-in functions that take a 3-component
          // vector return 4. (OpenCL 1.1 spec 6.11.12)
          if (n == 3)
            n = 4;

          return Success(n, E);
        } else
          return Success(1, E);
      }

      case UETT_SizeOf: {
        QualType SrcTy = E->getTypeOfArgument();
        // C++ [expr.sizeof]p2: "When applied to a reference or a reference type,
        //   the result is the size of the referenced type."
        if (const ReferenceType *Ref = SrcTy->getAs<ReferenceType>())
          SrcTy = Ref->getPointeeType();

        CharUnits Sizeof;
        if (!HandleSizeof(Info, E->getExprLoc(), SrcTy, Sizeof))
          return false;
        return Success(Sizeof, E);
      }
      }

      llvm_unreachable("unknown expr/type trait");
    }
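The UETT_SizeOf branch strips the reference before asking for the size, which is exactly the [expr.sizeof] rule quoted in its comment. A minimal sketch of how that rule shows up in user code (`Big` and `sizeOfReferenceTo` are names I made up for illustration):

```cpp
#include <cstddef>

// Hypothetical type that is clearly larger than a pointer.
struct Big {
    char payload[64];
};

// Applied to a reference type, sizeof reports the referenced type's size,
// not the size of whatever the compiler uses to represent the reference.
static_assert(sizeof(Big&) == sizeof(Big), "lvalue reference");
static_assert(sizeof(Big&&) == sizeof(Big), "rvalue reference");

// alignof follows the same rule for reference types.
static_assert(alignof(Big&) == alignof(Big), "alignment of referenced type");

// The rule also holds when the reference type is formed inside a template.
template <typename T>
constexpr std::size_t sizeOfReferenceTo() {
    return sizeof(T&);
}
static_assert(sizeOfReferenceTo<double>() == sizeof(double), "double&");
```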

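Following this method, we find that sizeof is really handled in HandleSizeof, and that the result is stored in Sizeof, which is a CharUnits. CharUnits is one of Clang's internal representations; quoting the comment from Clang's source: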

      /// CharUnits - This is an opaque type for sizes expressed in character units.
      /// Instances of this type represent a quantity as a multiple of the size
      /// of the standard C type, char, on the target architecture. As an opaque
      /// type, CharUnits protects you from accidentally combining operations on
      /// quantities in bit units and character units.
      ///
      /// In both C and C++, an object of type 'char', 'signed char', or 'unsigned
      /// char' occupies exactly one byte, so 'character unit' and 'byte' refer to
      /// the same quantity of storage. However, we use the term 'character unit'
      /// rather than 'byte' to avoid an implication that a character unit is
      /// exactly 8 bits.
      ///
      /// For portability, never assume that a target character is 8 bits wide. Use
      /// CharUnit values wherever you calculate sizes, offsets, or alignments
      /// in character units.
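The idea behind such an opaque type can be sketched in a few lines. This is a hypothetical, heavily simplified stand-in that I wrote for illustration (`CharUnitsSketch` and its members are not Clang's real class or API); it uses the host's CHAR_BIT where Clang would ask the target:

```cpp
#include <climits>
#include <cstdint>

// Hypothetical stand-in for the idea described in the comment above.
class CharUnitsSketch {
public:
    // The only ways in: callers must say explicitly that they mean chars.
    static constexpr CharUnitsSketch fromQuantity(std::int64_t chars) {
        return CharUnitsSketch(chars);
    }
    static constexpr CharUnitsSketch one() { return CharUnitsSketch(1); }

    constexpr std::int64_t quantity() const { return quantity_; }

    // Converting to bits is an explicit, named operation.
    constexpr std::int64_t toBits() const { return quantity_ * CHAR_BIT; }

    constexpr CharUnitsSketch operator+(CharUnitsSketch other) const {
        return CharUnitsSketch(quantity_ + other.quantity_);
    }
    constexpr bool operator==(CharUnitsSketch other) const {
        return quantity_ == other.quantity_;
    }

private:
    // Explicit and private: a plain integer never silently becomes a size.
    explicit constexpr CharUnitsSketch(std::int64_t chars) : quantity_(chars) {}
    std::int64_t quantity_;
};

static_assert(CharUnitsSketch::fromQuantity(4).toBits() == 4 * CHAR_BIT,
              "bits are derived from chars, never mixed with them");
```

Because there is no implicit conversion from an integer, something like `CharUnitsSketch::one() + 32` does not compile, which is the kind of bit/char mix-up the quoted comment says the type guards against.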

Next, we look for HandleSizeof:

    /// Get the size of the given type in char units.
    static bool HandleSizeof(EvalInfo &Info, SourceLocation Loc,
                             QualType Type, CharUnits &Size) {
      // sizeof(void), __alignof__(void), sizeof(function) = 1 as a gcc
      // extension.
      if (Type->isVoidType() || Type->isFunctionType()) {
        Size = CharUnits::One();
        return true;
      }

      if (!Type->isConstantSizeType()) {
        // sizeof(vla) is not a constantexpr: C99 6.5.3.4p2.
        // FIXME: Better diagnostic.
        Info.Diag(Loc);
        return false;
      }

      Size = Info.Ctx.getTypeSizeInChars(Type);
      return true;
    }

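At this point we can see why the expression gets replaced. If the type is void or a function type, the compiler substitutes the constant CharUnits::One() (the size of one char) directly. This is why only a constant is visible in the assembly: assembly is produced later, during CodeGen, while this evaluation happens before CodeGen. The function also checks isConstantSizeType(), because the value has to be computable at compile time; the comment on that branch is about VLAs, and interested readers can look up the C99 clause it cites (6.5.3.4p2). In every other case the type is handed to getTypeSizeInChars.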
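The three branches of HandleSizeof can be modeled with a toy function. Everything here (`KindSketch`, `TypeSketch`, `SizeofResult`, `handleSizeofSketch`) is hypothetical and only mirrors the control flow shown above; Clang's QualType carries far more information. The sketch assumes C++14 or later:

```cpp
#include <cstddef>

// Hypothetical, heavily simplified description of a type.
enum class KindSketch { Void, Function, VariableLengthArray, Ordinary };

struct TypeSketch {
    KindSketch kind;
    std::size_t sizeInChars;  // only meaningful for KindSketch::Ordinary
};

struct SizeofResult {
    bool isConstant;   // false means "cannot be folded at compile time"
    std::size_t size;  // valid only when isConstant is true
};

constexpr SizeofResult handleSizeofSketch(TypeSketch type) {
    // Branch 1: void and function types fold to 1 (the GCC extension).
    if (type.kind == KindSketch::Void || type.kind == KindSketch::Function)
        return {true, 1};

    // Branch 2: a VLA has no compile-time size, so folding is refused.
    if (type.kind == KindSketch::VariableLengthArray)
        return {false, 0};

    // Branch 3: everything else is looked up from the type's layout.
    return {true, type.sizeInChars};
}

static_assert(handleSizeofSketch({KindSketch::Void, 0}).size == 1,
              "void folds to one char unit");
static_assert(!handleSizeofSketch({KindSketch::VariableLengthArray, 0}).isConstant,
              "a VLA is not a constant");
```

In the VLA case the refusal only means that constant evaluation gives up; the size then has to be computed by code that runs at execution time, which is the special case mentioned at the start.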

OK, let us keep going step by step and see what getTypeSizeInChars does.

    /// getTypeSizeInChars - Return the size of the specified type, in characters.
    /// This method does not work on incomplete types.
    CharUnits ASTContext::getTypeSizeInChars(QualType T) const {
      return getTypeInfoInChars(T).first;
    }

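Even without going further we can tell that this method returns the size of the given type, but let us get to the bottom of it and see how it is really implemented. So we continue into getTypeInfoInChars().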

    std::pair<CharUnits, CharUnits>
    ASTContext::getTypeInfoInChars(QualType T) const {
      return getTypeInfoInChars(T.getTypePtr());
    }

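Now we know where the .first came from: the method returns a std::pair. The call goes to getTypeInfoInChars again, but this time the argument is a type pointer (const Type *), so we look for that overload: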

    std::pair<CharUnits, CharUnits>
    ASTContext::getTypeInfoInChars(const Type *T) const {
      if (const ConstantArrayType *CAT = dyn_cast<ConstantArrayType>(T))
        return getConstantArrayInfoInChars(*this, CAT);
      TypeInfo Info = getTypeInfo(T);
      return std::make_pair(toCharUnitsFromBits(Info.Width),
                            toCharUnitsFromBits(Info.Align));
    }
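The overload above takes a width and an alignment measured in bits and returns them as a (size, alignment) pair measured in chars, which is why the caller reads `.first`. A rough sketch of that conversion, assuming C++14 or later; `TypeInfoSketch`, `toCharsFromBits` and `typeInfoInCharsSketch` are made-up names, and the host's CHAR_BIT stands in for the target's char width:

```cpp
#include <climits>
#include <cstdint>
#include <utility>

// Hypothetical mirror of the two TypeInfo fields used above, both in bits.
struct TypeInfoSketch {
    std::uint64_t width;
    unsigned align;
};

// Bits to character units: divide by the number of bits in a char.
constexpr std::uint64_t toCharsFromBits(std::uint64_t bits) {
    return bits / CHAR_BIT;
}

// first = size in chars, second = alignment in chars.
constexpr std::pair<std::uint64_t, std::uint64_t>
typeInfoInCharsSketch(TypeInfoSketch info) {
    return {toCharsFromBits(info.width), toCharsFromBits(info.align)};
}

// A 32-bit wide, 32-bit aligned type, as an int might be on some target.
static_assert(typeInfoInCharsSketch({32, 32}).first == 32 / CHAR_BIT,
              "size is the first element of the pair");
static_assert(typeInfoInCharsSketch({32, 16}).second == 16 / CHAR_BIT,
              "alignment is the second element of the pair");
```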

That leads to getTypeInfo, so we look up the corresponding code:

    TypeInfo ASTContext::getTypeInfo(const Type *T) const {
      TypeInfoMap::iterator I = MemoizedTypeInfo.find(T);
      if (I != MemoizedTypeInfo.end())
        return I->second;

      // This call can invalidate MemoizedTypeInfo[T], so we need a second lookup.
      TypeInfo TI = getTypeInfoImpl(T);
      MemoizedTypeInfo[T] = TI;
      return TI;
    }
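getTypeInfo is a memoizing wrapper: look in the cache, otherwise compute and store. The "second lookup" remark is there because the computation can recurse and add entries of its own. A minimal sketch of the same pattern, using std::unordered_map in place of Clang's own map type; `SizeCacheSketch` and its toy "types" are invented for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical cache. "Type" n is modeled as an array of two elements of
// type n - 1, and type 0 occupies 4 chars.
class SizeCacheSketch {
public:
    std::uint64_t sizeOf(unsigned typeId) {
        auto it = cache_.find(typeId);
        if (it != cache_.end())
            return it->second;

        // computeSize() recurses into sizeOf() and may insert new entries,
        // which can rehash the table. An iterator obtained before the call
        // must not be reused, so the map is indexed again afterwards.
        std::uint64_t size = computeSize(typeId);
        cache_[typeId] = size;
        return size;
    }

    std::size_t computeCalls() const { return computeCalls_; }

private:
    std::uint64_t computeSize(unsigned typeId) {
        ++computeCalls_;
        if (typeId == 0)
            return 4;
        return 2 * sizeOf(typeId - 1);
    }

    std::unordered_map<unsigned, std::uint64_t> cache_;
    std::size_t computeCalls_ = 0;
};

// Small self-check: the second query is answered from the cache.
inline void checkSizeCacheSketch() {
    SizeCacheSketch cache;
    assert(cache.sizeOf(3) == 32);      // 4 * 2 * 2 * 2
    assert(cache.computeCalls() == 4);  // types 3, 2, 1 and 0
    assert(cache.sizeOf(3) == 32);
    assert(cache.computeCalls() == 4);  // nothing was recomputed
}
```

Calling `checkSizeCacheSketch()` from a `main` exercises both the cold and the cached path.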

We do not need to worry about MemoizedTypeInfo for now; what we are after is inside getTypeInfoImpl:

    /// getTypeInfoImpl - Return the size of the specified type, in bits.  This
    /// method does not work on incomplete types.
    ///
    /// FIXME: Pointers into different addr spaces could have different sizes and
    /// alignment requirements: getPointerInfo should take an AddrSpace, this
    /// should take a QualType, &c.
    TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const {
      uint64_t Width = 0;
      unsigned Align = 8;
      bool AlignIsRequired = false;
      switch (T->getTypeClass()) {
    #define TYPE(Class, Base)
    #define ABSTRACT_TYPE(Class, Base)
    #define NON_CANONICAL_TYPE(Class, Base)
    #define DEPENDENT_TYPE(Class, Base) case Type::Class:
    #define NON_CANONICAL_UNLESS_DEPENDENT_TYPE(Class, Base)                       \
      case Type::Class:                                                            \
      assert(!T->isDependentType() && "should not see dependent types here");      \
      return getTypeInfo(cast<Class##Type>(T)->desugar().getTypePtr());
    #include "clang/AST/TypeNodes.def"
        llvm_unreachable("Should not see dependent types");

      case Type::FunctionNoProto:
      case Type::FunctionProto:
        // GCC extension: alignof(function) = 32 bits
        Width = 0;
        Align = 32;
        break;

      case Type::IncompleteArray:
      case Type::VariableArray:
        Width = 0;
        Align = getTypeAlign(cast<ArrayType>(T)->getElementType());
        break;

      case Type::ConstantArray: {
        const ConstantArrayType *CAT = cast<ConstantArrayType>(T);

        TypeInfo EltInfo = getTypeInfo(CAT->getElementType());
        uint64_t Size = CAT->getSize().getZExtValue();
        assert((Size == 0 || EltInfo.Width <= (uint64_t)(-1) / Size) &&
               "Overflow in array type bit size evaluation");
        Width = EltInfo.Width * Size;
        Align = EltInfo.Align;
        if (!getTargetInfo().getCXXABI().isMicrosoft() ||
            getTargetInfo().getPointerWidth(0) == 64)
          Width = llvm::RoundUpToAlignment(Width, Align);
        break;
      }
      case Type::ExtVector:
      case Type::Vector: {
        const VectorType *VT = cast<VectorType>(T);
        TypeInfo EltInfo = getTypeInfo(VT->getElementType());
        Width = EltInfo.Width * VT->getNumElements();
        Align = Width;
        // If the alignment is not a power of 2, round up to the next power of 2.
        // This happens for non-power-of-2 length vectors.
        if (Align & (Align-1)) {
          Align = llvm::NextPowerOf2(Align);
          Width = llvm::RoundUpToAlignment(Width, Align);
        }
        // Adjust the alignment based on the target max.
        uint64_t TargetVectorAlign = Target->getMaxVectorAlign();
        if (TargetVectorAlign && TargetVectorAlign < Align)
          Align = TargetVectorAlign;
        break;
      }

      case Type::Builtin:
        switch (cast<BuiltinType>(T)->getKind()) {
        default: llvm_unreachable("Unknown builtin type!");
        case BuiltinType::Void:
          // GCC extension: alignof(void) = 8 bits.
          Width = 0;
          Align = 8;
          break;

        case BuiltinType::Bool:
          Width = Target->getBoolWidth();
          Align = Target->getBoolAlign();
          break;
        case BuiltinType::Char_S:
        case BuiltinType::Char_U:
        case BuiltinType::UChar:
        case BuiltinType::SChar:
          Width = Target->getCharWidth();
          Align = Target->getCharAlign();
          break;
        case BuiltinType::WChar_S:
        case BuiltinType::WChar_U:
          Width = Target->getWCharWidth();
          Align = Target->getWCharAlign();
          break;
        case BuiltinType::Char16:
          Width = Target->getChar16Width();
          Align = Target->getChar16Align();
          break;
        case BuiltinType::Char32:
          Width = Target->getChar32Width();
          Align = Target->getChar32Align();
          break;
        case BuiltinType::UShort:
        case BuiltinType::Short:
          Width = Target->getShortWidth();
          Align = Target->getShortAlign();
          break;
        case BuiltinType::UInt:
        case BuiltinType::Int:
          Width = Target->getIntWidth();
          Align = Target->getIntAlign();
          break;
        case BuiltinType::ULong:
        case BuiltinType::Long:
          Width = Target->getLongWidth();
          Align = Target->getLongAlign();
          break;
        case BuiltinType::ULongLong:
        case BuiltinType::LongLong:
          Width = Target->getLongLongWidth();
          Align = Target->getLongLongAlign();
          break;
        case BuiltinType::Int128:
        case BuiltinType::UInt128:
          Width = 128;
          Align = 128; // int128_t is 128-bit aligned on all targets.
          break;
        case BuiltinType::Half:
          Width = Target->getHalfWidth();
          Align = Target->getHalfAlign();
          break;
        case BuiltinType::Float:
          Width = Target->getFloatWidth();
          Align = Target->getFloatAlign();
          break;
        case BuiltinType::Double:
          Width = Target->getDoubleWidth();
          Align = Target->getDoubleAlign();
          break;
        case BuiltinType::LongDouble:
          Width = Target->getLongDoubleWidth();
          Align = Target->getLongDoubleAlign();
          break;
        case BuiltinType::NullPtr:
          Width = Target->getPointerWidth(0); // C++ 3.9.1p11: sizeof(nullptr_t)
          Align = Target->getPointerAlign(0); //   == sizeof(void*)
          break;
        case BuiltinType::ObjCId:
        case BuiltinType::ObjCClass:
        case BuiltinType::ObjCSel:
          Width = Target->getPointerWidth(0);
          Align = Target->getPointerAlign(0);
          break;
        case BuiltinType::OCLSampler:
          // Samplers are modeled as integers.
          Width = Target->getIntWidth();
          Align = Target->getIntAlign();
          break;
        case BuiltinType::OCLEvent:
        case BuiltinType::OCLImage1d:
        case BuiltinType::OCLImage1dArray:
        case BuiltinType::OCLImage1dBuffer:
        case BuiltinType::OCLImage2d:
        case BuiltinType::OCLImage2dArray:
        case BuiltinType::OCLImage3d:
          // Currently these types are pointers to opaque types.
          Width = Target->getPointerWidth(0);
          Align = Target->getPointerAlign(0);
          break;
        }
        break;
      case Type::ObjCObjectPointer:
        Width = Target->getPointerWidth(0);
        Align = Target->getPointerAlign(0);
        break;
      case Type::BlockPointer: {
        unsigned AS = getTargetAddressSpace(
            cast<BlockPointerType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::LValueReference:
      case Type::RValueReference: {
        // alignof and sizeof should never enter this code path here, so we go
        // the pointer route.
        unsigned AS = getTargetAddressSpace(
            cast<ReferenceType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::Pointer: {
        unsigned AS = getTargetAddressSpace(cast<PointerType>(T)->getPointeeType());
        Width = Target->getPointerWidth(AS);
        Align = Target->getPointerAlign(AS);
        break;
      }
      case Type::MemberPointer: {
        const MemberPointerType *MPT = cast<MemberPointerType>(T);
        std::tie(Width, Align) = ABI->getMemberPointerWidthAndAlign(MPT);
        break;
      }
      case Type::Complex: {
        // Complex types have the same alignment as their elements, but twice the
        // size.
        TypeInfo EltInfo = getTypeInfo(cast<ComplexType>(T)->getElementType());
        Width = EltInfo.Width * 2;
        Align = EltInfo.Align;
        break;
      }
      case Type::ObjCObject:
        return getTypeInfo(cast<ObjCObjectType>(T)->getBaseType().getTypePtr());
      case Type::Adjusted:
      case Type::Decayed:
        return getTypeInfo(cast<AdjustedType>(T)->getAdjustedType().getTypePtr());
      case Type::ObjCInterface: {
        const ObjCInterfaceType *ObjCI = cast<ObjCInterfaceType>(T);
        const ASTRecordLayout &Layout = getASTObjCInterfaceLayout(ObjCI->getDecl());
        Width = toBits(Layout.getSize());
        Align = toBits(Layout.getAlignment());
        break;
      }
      case Type::Record:
      case Type::Enum: {
        const TagType *TT = cast<TagType>(T);

        if (TT->getDecl()->isInvalidDecl()) {
          Width = 8;
          Align = 8;
          break;
        }

        if (const EnumType *ET = dyn_cast<EnumType>(TT)) {
          const EnumDecl *ED = ET->getDecl();
          TypeInfo Info =
              getTypeInfo(ED->getIntegerType()->getUnqualifiedDesugaredType());
          if (unsigned AttrAlign = ED->getMaxAlignment()) {
            Info.Align = AttrAlign;
            Info.AlignIsRequired = true;
          }
          return Info;
        }

        const RecordType *RT = cast<RecordType>(TT);
        const RecordDecl *RD = RT->getDecl();
        const ASTRecordLayout &Layout = getASTRecordLayout(RD);
        Width = toBits(Layout.getSize());
        Align = toBits(Layout.getAlignment());
        AlignIsRequired = RD->hasAttr<AlignedAttr>();
        break;
      }

      case Type::SubstTemplateTypeParm:
        return getTypeInfo(cast<SubstTemplateTypeParmType>(T)->
                           getReplacementType().getTypePtr());

      case Type::Auto: {
        const AutoType *A = cast<AutoType>(T);
        assert(!A->getDeducedType().isNull() &&
               "cannot request the size of an undeduced or dependent auto type");
        return getTypeInfo(A->getDeducedType().getTypePtr());
      }

      case Type::Paren:
        return getTypeInfo(cast<ParenType>(T)->getInnerType().getTypePtr());

      case Type::Typedef: {
        const TypedefNameDecl *Typedef = cast<TypedefType>(T)->getDecl();
        TypeInfo Info = getTypeInfo(Typedef->getUnderlyingType().getTypePtr());
        // If the typedef has an aligned attribute on it, it overrides any computed
        // alignment we have.  This violates the GCC documentation (which says that
        // attribute(aligned) can only round up) but matches its implementation.
        if (unsigned AttrAlign = Typedef->getMaxAlignment()) {
          Align = AttrAlign;
          AlignIsRequired = true;
        } else {
          Align = Info.Align;
          AlignIsRequired = Info.AlignIsRequired;
        }
        Width = Info.Width;
        break;
      }

      case Type::Elaborated:
        return getTypeInfo(cast<ElaboratedType>(T)->getNamedType().getTypePtr());

      case Type::Attributed:
        return getTypeInfo(
                      cast<AttributedType>(T)->getEquivalentType().getTypePtr());

      case Type::Atomic: {
        // Start with the base type information.
        TypeInfo Info = getTypeInfo(cast<AtomicType>(T)->getValueType());
        Width = Info.Width;
        Align = Info.Align;

        // If the size of the type doesn't exceed the platform's max
        // atomic promotion width, make the size and alignment more
        // favorable to atomic operations:
        if (Width != 0 && Width <= Target->getMaxAtomicPromoteWidth()) {
          // Round the size up to a power of 2.
          if (!llvm::isPowerOf2_64(Width))
            Width = llvm::NextPowerOf2(Width);

          // Set the alignment equal to the size.
          Align = static_cast<unsigned>(Width);
        }
      }
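Two of the less obvious branches above, Vector and Atomic, are just rounding arithmetic on bit widths. The following is a sketch under my own assumptions (C++14 or later): all names ending in `Sketch` are hypothetical, `nextPowerOf2Sketch` models what I understand llvm::NextPowerOf2 to do (smallest power of two strictly greater than its argument), and the target's maximum vector alignment is ignored:

```cpp
#include <cstdint>

// Smallest power of two strictly greater than v; 0 if it does not fit.
constexpr std::uint64_t nextPowerOf2Sketch(std::uint64_t v) {
    std::uint64_t p = 1;
    while (p != 0 && p <= v)
        p <<= 1;
    return p;
}

constexpr bool isPowerOf2Sketch(std::uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

constexpr std::uint64_t roundUpToAlignmentSketch(std::uint64_t value,
                                                 std::uint64_t align) {
    return (value + align - 1) / align * align;
}

struct WidthAlign {
    std::uint64_t width;  // in bits
    std::uint64_t align;  // in bits
};

// Vector branch: alignment starts equal to the width and both are rounded
// up when the width is not a power of two (e.g. 3-element vectors).
constexpr WidthAlign vectorInfoSketch(std::uint64_t eltWidth,
                                      std::uint64_t count) {
    std::uint64_t width = eltWidth * count;
    std::uint64_t align = width;
    if (align & (align - 1)) {
        align = nextPowerOf2Sketch(align);
        width = roundUpToAlignmentSketch(width, align);
    }
    return {width, align};
}

// Atomic branch: small types are padded to a power of two and aligned to
// their own size; larger ones keep the base type's layout.
constexpr WidthAlign atomicInfoSketch(WidthAlign base,
                                      std::uint64_t maxPromoteWidth) {
    if (base.width != 0 && base.width <= maxPromoteWidth) {
        if (!isPowerOf2Sketch(base.width))
            base.width = nextPowerOf2Sketch(base.width);
        base.align = base.width;
    }
    return base;
}

// Three 32-bit elements occupy 96 bits but are laid out as 128.
static_assert(vectorInfoSketch(32, 3).width == 128, "padded to 128 bits");
// A 24-bit payload becomes a 32-bit, 32-bit-aligned atomic.
static_assert(atomicInfoSketch({24, 8}, 64).align == 32, "aligned to size");

// The ConstantArray and Enum branches are visible from ordinary C++:
enum class SmallEnumSketch : unsigned char { A };
static_assert(sizeof(int[10]) == 10 * sizeof(int), "element size * count");
static_assert(sizeof(SmallEnumSketch) == sizeof(unsigned char),
              "an enum has the size of its underlying type");
```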

Everything is out in the open now, and no further explanation is needed :-)




  






© 2024-11-22 - tinynew.org. All Rights Reserved.